# «Item type text; Dissertation-Reproduction (electronic) Authors DUNN, THURMAN STANLEY. Publisher The University of Arizona. Rights Copyright © is ...»

sampling is to disclose evidence of some activity, usually some irregularity within a system. Further, the disclosure can be assured with statistically defensible confidence levels if relatively small samples from the population are examined.

For the solution to the combinatorial dilemma the activity or irregularity will be defined to be a combination which, based on some measurable criteria, is superior to all others observed to that point.

possible to state with a confidence level corresponding with the sample size that the combination which could not be beaten in a total sample is within some specific percentile of possible combinations. For example, by taking the appropriate number of samples of a given size, it is possible to assure with a 99.9 percent confidence level that a given combination, unbeatable for one total sample, is within the top .2 percent of all possible solutions.
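The link between confidence level, percentile, and sample size can be sketched as follows. This is a minimal illustration, assuming combinations are drawn independently (with replacement) from a population large enough that one draw does not affect the next. The function name `required_sample_size` is hypothetical, and the formula is the standard one under that assumption, not necessarily the source of the values used in Figure 32.

```python
import math


def required_sample_size(confidence: float, top_fraction: float) -> int:
    """Smallest n for which n independent random draws include at least one
    combination from the top `top_fraction` with probability >= `confidence`.
    """
    if not (0.0 < confidence < 1.0 and 0.0 < top_fraction < 1.0):
        raise ValueError("confidence and top_fraction must lie strictly between 0 and 1")
    # P(no top combination in n draws) = (1 - top_fraction) ** n
    # Require that probability to be at most 1 - confidence and solve for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - top_fraction))


# The example from the text: 99.9 percent confidence, top .2 percent.
n = required_sample_size(0.999, 0.002)
```

Under these assumptions the example in the text calls for a sample of 3,451 combinations; sampling without replacement from a finite population would need slightly fewer.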

Once the desired confidence level, percentile and corresponding sample size have been determined, the remainder of the logic in Figure 32 is oriented to seeking a near optimum combination as defined by the parameters from block one of the flowchart.

As shown in the second block in Figure 32, a "Combination to Beat" is selected. Although any combination could be selected for this purpose, a suggested approach is to run through one complete random sample taking the best combination from that sample as the starting "Combination to Beat". Thus, the starting "Combination to Beat" should be a relatively tough combination to beat, possibly cutting down on the total number of samples that must be taken before a combination is found which is unbeatable when compared to each combination in one complete random sample. The reason for this is that each time a combination from the current random sample is found to be superior to the existing "Combination to Beat", this combination is established as the new "Combination to Beat" and a new random sample is selected.

Beginning with an easy "Combination to Beat" will probably result in several random samples being chosen in order to converge to a combination equal or superior to that chosen by evaluating one complete random sample and choosing the best combination from that sample.

Once a starting "Combination to Beat" has been selected, a random sample of combinations is taken as shown in block three. Each combination from this sample is then compared to the "Combination to Beat" until either a superior combination from the sample is found or all combinations from the sample have been compared without finding one superior to the "Combination to Beat". If, as shown in blocks five and six of Figure 32, a combination from the sample is superior to the "Combination to Beat", it is established as the new "Combination to Beat" and a new random sample is selected against which to compare it.

As shown in blocks seven and eight, additional combinations from the sample are compared to the existing "Combination to Beat" until either a combination from the sample beats the existing "Combination to Beat" or a complete random sample has been compared without finding one which is superior.

Once a complete random sample has been compared to the existing "Combination to Beat" without finding any combinations which beat it, a solution has been found as shown in block nine. Since a complete sample did not reveal any combinations superior to the existing "Combination to Beat", it may be stated with the confidence level from block one that the unbeatable combination is within the corresponding percentile of possible solutions. Another way of interpreting the results is that the nondiscovery of a superior combination in one complete random sample ensures that the rate of occurrence of superior combinations is no greater than some given percentage with a given confidence level. Thus, for example, it might be said, as in the example cited earlier, that a confidence level of 99.8 percent can be assured that a solution is within the 99.8th percentile of possible solutions or, conversely, with the same confidence level that the rate of occurrence of superior combinations is no greater than .002.
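The search loop described above can be sketched in a few lines. This is an illustrative reading of the flowchart logic, not a transcription of Figure 32: the names `find_unbeatable`, `draw`, and `score` are hypothetical, combinations are assumed to be drawn one at a time with replacement, and a higher score is assumed to mean a better combination.

```python
import random
from typing import Callable, Tuple, TypeVar

Combo = TypeVar("Combo")


def find_unbeatable(
    draw: Callable[[random.Random], Combo],
    score: Callable[[Combo], float],
    sample_size: int,
    rng: random.Random,
) -> Tuple[Combo, int]:
    """Return the unbeaten combination and the number of samples started."""
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")

    # Block two: use the best of one complete random sample as the
    # starting "Combination to Beat".
    to_beat = max((draw(rng) for _ in range(sample_size)), key=score)
    samples_started = 1

    while True:
        # Block three: begin a fresh random sample.
        samples_started += 1
        for _ in range(sample_size):
            candidate = draw(rng)
            if score(candidate) > score(to_beat):
                # Blocks five and six: the superior combination becomes the
                # new "Combination to Beat" and this sample is abandoned.
                to_beat = candidate
                break
        else:
            # Block nine: a complete sample failed to beat it.
            return to_beat, samples_started


# Toy usage: combinations are the integers 0..9 and larger is better.
best, samples = find_unbeatable(
    draw=lambda r: r.randrange(10),
    score=float,
    sample_size=500,
    rng=random.Random(0),
)
```

Seeding the search with the best of one full sample, as the text suggests, means the first comparison sample already faces a strong incumbent, so fewer samples are typically abandoned partway through.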

The process described above forms the basis for the Resource Optimization Model in Chapter 7, which may be used to produce a near optimum utilization of resources in the detection of computer fraud for computer systems.

## CHAPTER 7

**RESOURCE OPTIMIZATION MODEL**

The objective of the Resource Optimization Model in this chapter is to optimize the utilization of available resources in the detection of computer fraud as measured by the Detection Quotient described in Chapter 4. Recall that the Detection Quotient for a given system is the sum of the Detection Quotients for each threat in the system. As illustrated in Chapter 4, Figure 15, the Detection Quotient for each threat is a function of three factors: the numerical threat value of each threat (T); the probability of detecting at least one occurrence of fraud (P); and the converse of the rate of occurrence (C). In Figure 15 the Detection Quotients (DQ's) for the 21 major threats identified during the research for this thesis, covered in Chapter 3, are calculated using the formula DQ = TPC. For illustrative purposes in Figure 15, a "P" value of .9 and a "C" value of .995 are assumed for each of the 21 threats. The underlying assumption is that investigative resources are available to ensure, for each of the 21 threats, a 90 percent probability of detecting at least one occurrence of fraud if the rate of occurrence is at least .5 percent.

In actuality, the "P" and "C" values may vary from one threat to another. Further, given the same expenditure level of investigative resources, the detection level, as measured by the DQ value, may vary considerably from one "P" and "C" combination to another.

The configuration in Figure 15, whereby all "P" and "C" values are .9 and .995 respectively, is only one of a very large number of possible combinations. This combination, chosen arbitrarily for illustrative purposes, may represent a very poor utilization of resources since any number of the other possible "P" and "C" combinations may be superior, with resulting DQ's greater than the DQ value of 242.8 shown in Figure 15.
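The DQ = TPC calculation and its system-level sum can be sketched as below. The function names are hypothetical, and the three threat values are made up for illustration; they are not the values from Figure 15. Only the "P" value of .9 and the "C" value of .995 come from the text.

```python
from typing import Iterable, Tuple


def detection_quotient(t: float, p: float, c: float) -> float:
    """DQ for one threat: threat value T times P times C."""
    return t * p * c


def system_detection_quotient(threats: Iterable[Tuple[float, float, float]]) -> float:
    """DQ for a system: the sum of the per-threat DQ values."""
    return sum(detection_quotient(t, p, c) for t, p, c in threats)


# Hypothetical threat values (NOT taken from Figure 15), each paired with
# the uniform P = .9 and C = .995 assumed in the text.
example_threats = [(t, 0.9, 0.995) for t in (20.0, 12.5, 8.0)]
total_dq = system_detection_quotient(example_threats)
```

Holding T fixed, any other assignment of "P" and "C" values changes `total_dq`, which is what makes the choice of combination an optimization problem.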

The magnitude of the possible number of combinations of "P" and "C" values precludes examination of them all, as will be demonstrated later in this chapter. The Combinatorial Dilemma referred to in Chapter 6 applies, since it is not feasible to examine all possible "P" and "C" combinations in order to ensure the optimum use of detection resources.

The purpose of the Resource Optimization Model is to seek a near optimum combination of "P" and "C" values while only examining a fraction of the possible combinations for given levels of investigative resources.

In order that the analysis may be more easily relatable to specific systems, the hypothetical system in Chapter 5, "Specific Threat Assessment", will be used to demonstrate the Resource Optimization Model. While the threats in Figure 15 could be used to demonstrate the model, it should be more meaningful to use the hypothetical "specific" system in Chapter 5 since risk assessment and computer fraud detection must ultimately be applied to specific systems.

### Mathematical Statement of the Problem

The problem of optimizing the utilization of available resources in the detection of computer fraud is similar to the general problem of mathematical programming. The general problem in mathematical programming is to find the values of some variables which will optimize (i.e., maximize or minimize) the value of the objective function subject to a set of constraints. The mathematical programming problem can be formulated in the following general form (Kwak 1973):

Maximize (or Minimize)

$$F = \sum_{j=1}^{n} c_j x_j$$

Subject to

$$\sum_{j=1}^{n} a_{ij} x_j \le b_i \quad (\text{for } i = 1, 2, \ldots, m)$$

and

$$x_j \ge 0 \quad (\text{for } j = 1, 2, \ldots, n)$$

Where

$F$ = Value of the objective function which measures the effectiveness of the decision.

$x_j$ = Variables that are subject to the control of the decision maker.

$c_j$ = Unit profit contribution of a product or unit cost of an input which is known.

$a_{ij}$ = Production (or technical) coefficients that are known.

$b_i$ = Available productive resources in limited supply.

It should be reiterated that, even with a one hundred percent investigation, it may not be possible to find the "needle in the haystack" (e.g., one instance of fraud in 10 million transactions), because of inefficiencies in investigative techniques. Thus, "P" can typically only approach, rather than equal, one. For "C" to equal one, the rate of occurrence of fraud would have to be zero. This becomes nonsensical since, for a given "P" value, a "C" value of one suggests that, with the probability of "P", at least one occurrence of fraud will be detected if the rate of occurrence is zero. This is, of course, impossible for all "P" values except zero since there is no probability of detecting an occurrence if none exists.

As indicated previously, the one hundred percent investigation is typically not feasible. In fact, the typical organization will probably have, or be willing to expend, only a small percentage of the resources required for this type of investigation. In this environment of limited resources, the problem of setting "P" and "C" values for the various systems threats and the associated levels of investigative resources becomes a complex resource allocation problem. As will be shown, the "Combinatorial Dilemma" applies since a phenomenally large number of possible combinations of "P" and "C" values must typically be considered.

First, the objective function for the threat matrix in Figure 30 will be developed to demonstrate the process. The first step is to number the threats in the threat matrix. Cells with threat values of zero are skipped since they represent zero threat combinations. This has been accomplished in Figure 33.

The objective function is to maximize the total detection quotient for the system represented by the threat matrix in Figure 30, or mathematically:

$$\text{Maximize } D = \sum_{j=1}^{n} T_j P_j C_j$$

Referring to Figure 36, the expanded equation is:

Figure 33. Threat Matrix with Threats Numbered

Where

$x_j$ = Variables that are subject to the control of the decision maker.

$a_{ij}$ = Production (or technical) coefficients that are known.

$b_i$ = Available productive resources in limited supply.

For the resource optimization model the constraint equation may be stated as follows:

$$\sum_{j=1}^{n} a_{ij} P_j C_j \le b_i$$

$$P_j \ge 0 \quad (\text{for } j = 1, 2, \ldots, n)$$

$$C_j \ge 0 \quad (\text{for } j = 1, 2, \ldots, n)$$

Where

$P_j$ and $C_j$ = Variables that are subject to the control of the decision maker, specifically, the probability of detecting at least one occurrence of fraud (P) and the converse of the rate of occurrence (C). (It should be noted that the decision maker is not selecting the converse of the rate of occurrence of fraud but rather the converse of the rate of occurrence for which he or she would require the selected "P" value. For example, if the decision maker requires a 99 percent probability of detecting at least one occurrence of fraud if the rate of occurrence of fraud is .002, the "P" value selected is .99 and the "C" value selected is 1 - .002, or .998.)

$b_i$ = Available productive resources in limited supply, specifically, the amount of available investigative resources (hours).

$a_{ij}$ = Production (or technical) coefficients, which for the resource optimization model equate to the hours required to accomplish the "P" and "C" levels for a given threat. For example, referring to T1 in the Threat Matrix in Figure 36 (Transaction Added - Data Entry/Terminal Operator), assume that it has been determined that an average of 15 minutes is required to validate each transaction. In this example $a_{ij}$ equals .25 (hours).

$n$ = Total number of systems threats that have been identified for a system with a T value in excess of zero. (For the example in Figure 33, n = 17.)
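A candidate set of "P" and "C" values can be checked against the resource constraint with a short sketch such as the following. It assumes a single resource (investigative hours), so the index i is dropped, and it assumes the constraint is a less-than-or-equal relation, which suits a resource in limited supply. The function names and every coefficient other than the .25 hours and the .99/.998 pair from the text are hypothetical.

```python
from typing import Sequence


def hours_required(a: Sequence[float], p: Sequence[float], c: Sequence[float]) -> float:
    """Left-hand side of the constraint: the sum over threats of a_j * P_j * C_j."""
    if not (len(a) == len(p) == len(c)):
        raise ValueError("a, p and c must have one entry per threat")
    return sum(aj * pj * cj for aj, pj, cj in zip(a, p, c))


def is_feasible(
    a: Sequence[float],
    p: Sequence[float],
    c: Sequence[float],
    available_hours: float,
) -> bool:
    """True when P_j >= 0, C_j >= 0 and the hours used do not exceed the budget."""
    if any(v < 0 for v in p) or any(v < 0 for v in c):
        return False
    return hours_required(a, p, c) <= available_hours


# Three hypothetical threats; only a = .25 and the (.99, .998) pair echo the text.
a = [0.25, 0.5, 1.0]
p = [0.9, 0.99, 0.9]
c = [0.995, 0.998, 0.995]
used = hours_required(a, p, c)
```

With these illustrative numbers the combination fits within a budget of 2 hours but not within 1.5 hours, so a search over combinations would discard it under the tighter budget.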

### Decision Variables

The decision variables in a mathematical programming problem are those variables whose values are to be chosen. In the resource optimization model these variables are "P" and "C" for each threat in the system.

The "P" value, or probability of detecting at least one case of computer fraud, as a decision variable is rather straightforward. Given that there is a positive relationship between "P" and the level of

Figure 34. Combinatorial Example

possible "C" values.

The number of combinations (N) of "P" and "C" values for the threats in Figure 33 may now be calculated as follows:

$$N = (YZ)^X = (1{,}000 \cdot 10{,}000)^{17} = 10{,}000{,}000^{17}$$

There is little to be gained from carrying out the mathematics of 10 million to the seventeenth power. It goes without saying that the Combinatorial Dilemma applies. It should be noted that even if only two threats had been identified there would still be ten million squared, or one hundred trillion, combinations.
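For a sense of scale, the count can be evaluated exactly with arbitrary-precision integers. This is a small sketch in which the function and parameter names are hypothetical; the figures 1,000, 10,000, and 17 are those used in the calculation above.

```python
def combination_count(p_levels: int, c_levels: int, threats: int) -> int:
    """Number of ways to pick one (P, C) pair for each threat."""
    return (p_levels * c_levels) ** threats


n_17 = combination_count(1_000, 10_000, 17)
n_2 = combination_count(1_000, 10_000, 2)

# n_17 is a 1 followed by 119 zeros, so report it as a power of ten.
print(f"17 threats: 10^{len(str(n_17)) - 1} combinations")
print(f" 2 threats: {n_2:,} combinations")
```

Seventeen threats give 10^119 combinations, far beyond exhaustive evaluation, while the two-threat case gives the one hundred trillion cited in the text.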

and "C" values.

### An English Statement of the Resource Optimization Problem

The objective of the Resource Optimization Model is to allocate resources to the various identified threats in such a way that the detection capability for the system is maximized as measured by the Detection Quotient for the system. The underlying assumption is that investigative resources are limited and systems threats must compete for them.

The resources allocated to the various threats are driven by the "P" and "C" values for each threat. As demonstrated in the section on the Applicability of the Combinatorial Dilemma, the number of possible combinations of "P" and "C" values typically precludes a total evaluation, certainly for the hypothetical system represented in Figure 33.

### The Resource Allocation Solution

The solution to the resource allocation problem is based on the