# «Item type text; Dissertation-Reproduction (electronic) Authors DUNN, THURMAN STANLEY. Publisher The University of Arizona. Rights Copyright © is ...»

Detection Quotient (DQ) = Threat Value (T) times Probability of Detecting at Least One Occurrence of Fraud (P) times Converse of Occurrence Rate (C)

Or, DQ = TPC

Or, DQ = (82.2)(.95)(.999) = 78.0

The technique which will be used to ensure probabilities of detection for given rates of occurrence is the Discovery Sampling or Exploratory Sampling technique.
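The DQ product above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `detection_quotient` and its argument names are assumptions introduced here, and only the formula DQ = TPC and the input values come from the text.

```python
def detection_quotient(threat_value, p_detect, occurrence_rate):
    """Return DQ = T * P * C, where C is the converse (1 - rate) of the
    occurrence rate. The names are hypothetical; only the formula is
    taken from the text."""
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must be a probability between 0 and 1")
    if not 0.0 <= occurrence_rate <= 1.0:
        raise ValueError("occurrence_rate must be between 0 and 1")
    converse = 1.0 - occurrence_rate
    return threat_value * p_detect * converse


# Values from the text: T = 82.2, P = .95, occurrence rate = .001 (C = .999)
dq = detection_quotient(82.2, 0.95, 0.001)
print(round(dq, 1))  # 78.0
```

Because P and C can never exceed 1, the DQ for a threat can never exceed its threat value T.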

Discovery (Exploratory) Sampling

The purpose of discovery sampling is to disclose evidence of some activity, usually an irregularity, within a system. The type of evidence required need be only one example of such a serious deviation or irregularity. If found in the test, this one occurrence is sufficient to precipitate vigorous action such as a broader test or even a detailed examination. Thus, if the sample disclosed one example of a fraudulent transaction being added by a terminal operator, all transactions entered by that operator for the past year or more may be reviewed to determine the extent of the fraud.

The first thing that must be recognized is that discovery sampling does not provide a means for guaranteeing, with some small sampling, that the "needle in the haystack" type of case will be found.

For example, if only one instance of fraud exists in a field of one million records, no sample short of virtually complete examination can give any reasonable assurance that the case will be found. Arkin (1967) suggests that, due to the sheer mass of records to be examined, even a 100 percent check might not disclose such a unique instance. It should be noted that if the "needle in the haystack" case represents a small dollar value it may not be worth pursuing. If the one million records cited above represent $100 million, a single case of fraud for $1,000 represents only .00001 of the total dollars transacted. Except in very unusual circumstances, this amount would not cause much concern among higher management; in fact, it would probably be regarded as insignificant. On the other hand, if the fraud were for $10 million there would be much cause for concern. A solution to this problem, which will be discussed further later, is the method of stratification. Using this approach, dollar thresholds may be established above which all or some higher percentage of transactions are examined.

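The mathematical formula for calculating the probability of at least one occurrence of an event given a particular field size is shown below (Arkin 1967):

$$\Pr = 1 - \frac{C^{N-d}_{n}}{C^{N}_{n}}$$

where N is the field size, n is the sample size, d is the number of occurrences in the field, and $C^{a}_{b}$ denotes the number of combinations of a items taken b at a time.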
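The formula can be evaluated directly, as in the minimal sketch below. It assumes the hypergeometric reading of the formula (sampling without replacement); the function name `prob_at_least_one` and its parameter names are illustrative and do not come from Arkin. The demonstration uses the "needle in the haystack" case described earlier, one fraudulent record among one million.

```python
from math import comb


def prob_at_least_one(field_size, occurrences, sample_size):
    """Probability that a sample drawn without replacement contains at
    least one of the `occurrences` hidden in a field of `field_size`."""
    if not 0 <= occurrences <= field_size:
        raise ValueError("occurrences must lie between 0 and field_size")
    if not 0 <= sample_size <= field_size:
        raise ValueError("sample_size must lie between 0 and field_size")
    # Samples made up entirely of clean records, out of all possible samples.
    clean_samples = comb(field_size - occurrences, sample_size)
    all_samples = comb(field_size, sample_size)
    return 1 - clean_samples / all_samples


# One fraudulent record among one million: with a single occurrence the
# formula reduces to sample_size / field_size.
for sample_size in (1_000, 10_000, 990_000):
    p = prob_at_least_one(1_000_000, 1, sample_size)
    print(f"sample of {sample_size:>7,}: P = {p:.3f}")
```

With a single occurrence, the probability of detection is simply the fraction of the field that is sampled, which is why nothing short of a nearly complete examination gives reasonable assurance in that case.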

sample size for different organizations. A bank, built on the trust and confidence of its clients, might require a much more extensive sample than a supplier of electronic components even though the expected losses in the latter firm may be greater.

Another factor which might have a significant influence on the sample size is the nature of the transactions. For example, assume that in one case the dollar value of transactions was distributed as follows, based on the above field size of 10,000:

| Percentage of Transactions | Dollar Value |
|---|---|
| 1% (100 Transactions) | $100,000 or Greater |
| 80% (8,000 Transactions) | $10,000 to $100,000 |
| 19% (1,900 Transactions) | $10,000 or less |

In this example a perpetrator could, with only a few manipulations, extract a considerable sum without attracting too much attention due to the size of the transactions.

Now assume that the 10,000 transactions are distributed in the following dollar categories:

| Percentage of Transactions | Dollar Value |
|---|---|
| 5% (500 Transactions) | $500 - $1,000 |
| 95% (9,500 Transactions) | $500 or less |

In the latter distribution it would take numerous transaction manipulations within the normal dollar values shown to extract a sizable sum. Thus, in the latter example a sample based on an occurrence rate of .5% (50 out of 10,000) might be adequate. In the first example, a sample might have to be based on an occurrence rate of .1% (10 out of 10,000) or less to provide adequate protection.

For the first example above the stratification technique mentioned earlier might be desirable. For example, a total examination of all 100 transactions exceeding $100,000 might be performed, with discovery sampling then used for the remaining 9,900 transactions. Still another approach might be to examine all transactions over $100,000 and then establish two different discovery sampling schemes for the next two categories. Thus, for transactions between $10,000 and $100,000 a probability of at least 95% of detecting at least one occurrence may be required if the rate of occurrence is .2%. For the third category, or those transactions under $10,000, a probability of 90% of detecting at least one occurrence with a rate of occurrence of .5% might be adequate.
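The stratified scheme just described can be sketched in Python as follows. The sketch assumes sampling without replacement within each stratum and searches for the smallest sample that meets the required probability. The function names are hypothetical, and the occurrence counts are assumptions derived from the stated rates (.2% of 8,000 is 16; .5% of 1,900 is 9.5, rounded down to 9 here to stay on the conservative side). The results are exact minimums and need not match the entries in Arkin's tables.

```python
def prob_at_least_one(field_size, occurrences, sample_size):
    """Chance that a sample drawn without replacement contains at least
    one occurrence."""
    prob_none = 1.0
    for drawn in range(sample_size):
        clean_left = max(field_size - occurrences - drawn, 0)
        prob_none *= clean_left / (field_size - drawn)
    return 1.0 - prob_none


def min_sample_size(field_size, occurrences, target_prob):
    """Smallest sample size whose detection probability reaches target_prob."""
    prob_none = 1.0
    for drawn in range(field_size):
        clean_left = max(field_size - occurrences - drawn, 0)
        prob_none *= clean_left / (field_size - drawn)
        if 1.0 - prob_none >= target_prob:
            return drawn + 1
    return field_size


# (label, transactions in stratum, assumed occurrences, required probability)
strata = [
    ("$10,000 to $100,000", 8_000, 16, 0.95),  # .2% of 8,000
    ("under $10,000", 1_900, 9, 0.90),         # .5% of 1,900, rounded down
]

print("over $100,000: examine all 100 transactions")
for label, size, occurrences, target in strata:
    n = min_sample_size(size, occurrences, target)
    achieved = prob_at_least_one(size, occurrences, n)
    print(f"{label}: sample {n} of {size:,} (P = {achieved:.3f})")
```

Treating each stratum separately lets the examination effort follow the dollar exposure: the few largest transactions are all examined, while the lower strata are sampled at the protection levels chosen for them.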

It should be pointed out that a sample drawn for discovery sampling can also be interpreted to give an indication as to the degree of fraudulent activity at a given probability level. To illustrate, if a discovery sample is completed without disclosing an occurrence of fraud, this may be interpreted to mean that there is a high probability that the rate of occurrence is less than that used for the sample size selection.

Consider, for instance, the example above which was based on a field size of 10,000 and a rate of occurrence of .2% (20 out of 10,000). For this example a sample size of 2,000 transactions gives a probability of 98.9% of disclosing at least one example if there are 20 in the field. If none is found, the result may be interpreted to mean that there is a 98.7% probability that the rate is less than .2% or 20 out of 10,000.
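As a check on the 98.9% figure, the probability can be worked out by hand. The calculation below assumes the hypergeometric form of the discovery sampling formula (sampling without replacement), with field size N = 10,000, d = 20 occurrences and sample size n = 2,000, and writes $C^{a}_{b}$ for the number of combinations of a items taken b at a time:

$$\Pr = 1 - \frac{C^{9980}_{2000}}{C^{10000}_{2000}} = 1 - \prod_{i=0}^{19} \frac{8000 - i}{10000 - i} \approx 1 - 0.0115 = 0.9885$$

Each factor in the product is close to 8,000/10,000, so the product is close to $(0.8)^{20} \approx 0.0115$, and the result rounds to the 98.9% quoted above.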

Sample Size - Unlimited Resources

As indicated above, the sample size required is a direct function of several factors, one of which is management's aversiveness to risk, as measured by the probability of fraud in their systems.

Many managers faced with the question "how aversive are you to fraud?" would probably initially indicate a total aversiveness to fraud by stating that they would tolerate no fraud in their systems. However, after a brief explanation of the effort required to examine every transaction or change within their systems, most managers would probably agree that it is not feasible to eliminate all risk within a system. If, after acquiring a good understanding of the efforts involved, management is still totally aversive to fraud, the sample size is set at 100 percent and necessary resources applied. Given this situation, which is probably highly unusual, the analysis which follows in this section and in Chapter 7 is unnecessary since there is not a resource allocation problem. By assuming unlimited resources and setting the sample size at 100 percent, the detection quotient (DQ) would be maximized for both the individual threats identified in Chapter 3 (or Chapter 5) and for the total system.

Recall that the DQ value for each threat is determined by the formula DQ = TPC, where T = the threat value from Figure 12; P = the probability of detecting at least one occurrence of fraud; and C = the converse of the occurrence rate on which the sample was based.

By taking a 100 percent sample the probability of finding at least one occurrence, even if only one occurrence exists, should be close to 100 percent.

Recalling the previous example from Figure 12 which was based on transactions added by data entry/terminal operators, the following DQ would be applicable where the rate of occurrence is .0001 (1 out of 10,000):

DQ = TPC = (82.2)(1.00)(.9999) ≈ 82.2

If, as Arkin (1967) suggested, a total sample would not ensure a 100 percent probability of detecting a single occurrence when only one exists in a large field, the above formula would have to be adjusted by an inefficiency factor of some sort. If, for example, it could be shown that for the above situation with a 100 percent sample there is only a 98 percent probability of discovering a single occurrence because of inefficiencies in examining procedures, the DQ would become:

DQ = (82.2)(.98)(.9999) = 80.5

The above inefficiency factor would be difficult to quantify.

Since a total sample is typically not feasible for today's systems, the point becomes somewhat academic. The total sample is used in this section to illustrate the detection quotient rather than to suggest it as a viable alternative.

As shown above, inefficiencies in investigating have a decreasing effect on the DQ value. However, in either of the above cases, the DQ has been maximized since 100 percent of the transactions were examined. It should be clear by now that the maximum possible DQ values are the corresponding threat values shown in the threat matrix (Figure 12). Thus, for the threat matrix in Figure 12 the maximum possible DQs correspond to the threat values shown in the matrix cells.

The maximum total DQ for the matrix is 271 (the total for all threat values in Figure 12).

Sample Size - Limited Resources

The case described above probably represents an almost nonexistent situation in larger systems. The cost of conducting a 100 percent examination of all transactions and changes in a large system would be very high if such an examination is even possible. Given that it is possible, the cost effectiveness of such an approach would be doubtful in most systems. More typically, the resources available to dedicate to the examination of transactions or changes in a system are only a fraction of those required to conduct a 100 percent examination.

In the latter situation the primary objective is to maximize the ability to detect fraud, as measured by the system DQ, given the limitations on available resources. The computer fraud detection resource optimization model in Chapter 7 is designed to provide this maximum or near maximum DQ value.

The model in Chapter 7 is based on the premise that some limited number of hours is available to allocate to computer fraud detection. For example, assume that only 300 hours are available per month in a system that would require 5,000 hours per month for a 100 percent examination. The objective of the resource optimization model in this case would be to achieve the maximum or near maximum DQ for the system given the constraint of 300 hours.

As indicated previously, management aversiveness varies from one organization to another based on the type of business, distribution of transactions or changes in their systems, their own tolerance for risk, etc. Thus, it is possible that the DQ value and associated probabilities for given rates of occurrence of the different fraud schemes generated by the model in optimizing the use of the 300 available hours will not be as high as management would like. In this case management has two options. One is to allocate additional resources, thus raising the 300 hour constraint to some higher amount and rerunning the model. This process can be repeated until an acceptable combination of protection and resource allocation is reached. Management's second option is to accept the values associated with the 300 hours as the best that can be achieved, albeit not the most desirable, if additional hours cannot be allocated.

An alternative to the model was considered for establishing sample sizes and resulting probabilities for detecting fraud for the different schemes. This alternative is to determine, through interviews with management, the sample sizes for the various fraud schemes and perpetrator types, based on their aversiveness to fraud.

Thus, management would, through some series of questions and answers, reveal their required probability for detecting at least one occurrence of fraud in transactions added by data entry/terminal operators, for example, if certain rates of occurrence exist. These probabilities and associated rates of occurrence would determine the sample sizes, resulting DQ values and required resources.

This approach was rejected for four reasons: First, management would be placed in the uncomfortable position of having to implicitly condone a certain level of fraud by agreeing to something less than a 100 percent probability of detection; second, it is difficult to convert fraud aversiveness to terms of probabilities and rates of occurrence; third, unless resources are highly flexible it would take numerous iterations (and a great deal of management's time) to achieve compromised sets of probabilities and rates of occurrence that could be achieved within available resources; and fourth, given the level of available resources, it would be highly unlikely that management's probabilities and rates of occurrence would result in a system DQ as high as the model's.

In order to demonstrate the relationships between probabilities of detection, rates of occurrence, sample sizes and system DQ values, a sample application is presented.

The sample system contains the following volumes (field sizes) for the various scheme types each month:

| Scheme Type | Monthly Field Size |
|---|---|
| Transactions Added | 30,000 |
| Transactions Altered | 10,000 |
| Transactions Deleted | 5,000 |
| File Changes | 2,000 |
| Program Changes | 1,000 |
| Operation Cycles | 2,000 |

Now assume that management has decided that, if fraud exists with a rate of occurrence of .5% or greater, a probability of 90 percent must be ensured of detecting at least one such case. The sample sizes from Arkin's tables for each of the scheme types are shown in Figure 14.

As indicated in column 3 of Figure 14, as the population size increases, with the rate of occurrence held constant, the sample size needed to provide the same probability of discovering at least one occurrence of fraud increases at a decreasing rate. For populations above 10,000 the sample size remains almost constant for given rates of occurrence. It may be noted that for a population of 200,000, a sample size of 500 provides a 90 percent probability of detecting at least one occurrence of fraud when the occurrence rate is .5% - the same protection shown in Figure 14 for a population of 10,000.
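The leveling-off of the required sample size can be illustrated with a short Python sketch. It assumes sampling without replacement and computes the exact minimum sample for a 90 percent probability at a .5% occurrence rate; the function names and the constants `RATE_PER_THOUSAND` and `TARGET` are hypothetical. Because these are exact minimums rather than table entries, they will not necessarily match the sample sizes in Arkin's tables or in Figure 14.

```python
def prob_at_least_one(field_size, occurrences, sample_size):
    """Chance that a sample drawn without replacement contains at least
    one occurrence."""
    prob_none = 1.0
    for drawn in range(sample_size):
        clean_left = max(field_size - occurrences - drawn, 0)
        prob_none *= clean_left / (field_size - drawn)
    return 1.0 - prob_none


def min_sample_size(field_size, occurrences, target_prob):
    """Smallest sample size whose detection probability reaches target_prob."""
    prob_none = 1.0
    for drawn in range(field_size):
        clean_left = max(field_size - occurrences - drawn, 0)
        prob_none *= clean_left / (field_size - drawn)
        if 1.0 - prob_none >= target_prob:
            return drawn + 1
    return field_size


RATE_PER_THOUSAND = 5  # occurrence rate of .5%, kept as an integer
TARGET = 0.90          # required probability of at least one detection

# Field sizes from the sample system, plus the 200,000 population in the text.
for field_size in (1_000, 2_000, 5_000, 10_000, 30_000, 200_000):
    occurrences = field_size * RATE_PER_THOUSAND // 1_000
    n = min_sample_size(field_size, occurrences, TARGET)
    print(f"{field_size:>8,} records: minimum sample {n}")

# A sample of 500 meets the 90 percent requirement for both populations.
for field_size in (10_000, 200_000):
    occurrences = field_size * RATE_PER_THOUSAND // 1_000
    p = prob_at_least_one(field_size, occurrences, 500)
    print(f"sample of 500 from {field_size:,}: P = {p:.3f}")
```

The minimum grows quickly between the smallest populations and then changes very little, since for large fields the result depends almost entirely on the occurrence rate rather than on the field size.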

Figure 14. Sample Size Example

By combining the probabilities of detection, the rates of occurrence and systems threat values, DQ values may be derived for each of the systems threats and for the system as a whole. This has been accomplished in Figure 15 for the threat values derived in Chapter 3.

The threat values from Figure 13 in Chapter 3, a 90% probability of detection and an occurrence rate of .5% from above, are used to compute DQ = TPC.