«Evaluation of Three Court-Mandated Family Violence Interventions: FVEP, EXPLORE, and EVOLVE Stephen M. Cox, Ph.D. Professor Pierre M. Rivolta, Ph.D. ...»
Use of Arrests. There is no consensus among criminal justice and criminology scholars regarding the most accurate measure of recidivism. Program evaluation research in criminal justice typically uses new arrests, new convictions, or new incarceration sentences to measure recidivism or program success. We chose to use new arrests because we feel this measure best represents offender behavior, in that an offender acted in some manner that invoked a police response, although not every person who is arrested actually committed a crime. By contrast, the use of new convictions would likely underestimate offender recidivism given that over 80% of Connecticut family violence charges receive a nolle or are dismissed prior to court arraignment (Connecticut Statistical Analysis Center, 2007). We acknowledge that neither measure is perfect but believe that operationalizing recidivism using police arrests is more accurate in this context.
Note: ... selection bias. However, because eligible individuals may be denied participation in the program when assigned to the control group(s), this method violates ethical standards, which makes it hard to implement and unpopular.
Note: Because the traditional method of matching utilizes only a few specific variables for matching, it tends to lead to biased estimates of treatment effect, as differences between groups may still exist on other variables not used for matching.
Calculation of Effect Sizes
The overarching goal of this study was to calculate effect sizes for use in a cost-benefit analysis model for family violence programs. Effect sizes provide estimates of how much a program is able to change the outcome of its participants compared to a similar group of individuals who did not attend the program (Ferguson, 2009). They are useful in that they allow for the comparison of effects across multiple programs to determine whether some programs are more or less effective than others.
While there are many types of effect sizes, we present two of these in this study: d-cox transformations and odds ratios. We used d-cox transformations since they have been found to be the most appropriate for evaluative studies where the outcomes are dichotomous (i.e., have only two possible outcomes, such as re-arrested or not re-arrested) (see Sánchez-Meca, Chacón-Moscoso, and Marín-Martínez, 2003). The d-cox effect sizes approximate the average differences between the two study groups.
The d-cox transformation is an estimated effect size derived by dividing the natural logarithm of the odds ratio by the constant of 1.65 (Pe represents the percentage of family violence program completers who were not rearrested, and Pc represents the percentage of family violence offenders who did not attend the program and were not rearrested) (see Drake, Aos, & Miller, 2009; Sánchez-Meca et al., 2003; Washington State Institute for Public Policy, 2013; Washington State Institute for Public Policy, 2009).
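As a sketch of the transformation just described, and assuming the standard Cox form with the odds ratio built from the cell counts defined in the next sentence, the calculation can be written as follows. The symbol OR is introduced here for illustration, and Pe and Pc are treated as proportions (not rearrested) rather than percentages.

```latex
d_{\mathrm{cox}} = \frac{\ln(OR)}{1.65},
\qquad
OR = \frac{O_{1E}\, O_{2C}}{O_{2E}\, O_{1C}}
   = \frac{P_e \,(1 - P_c)}{P_c \,(1 - P_e)},
\qquad
P_e = \frac{O_{1E}}{O_{1E} + O_{2E}},
\quad
P_c = \frac{O_{1C}}{O_{1C} + O_{2C}}
```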
In the above equation, O1E and O1C represent the number of offenders who were not rearrested in the treatment and comparison groups while O2E and O2C represent the number of offenders who were rearrested.
We also reported odds ratios along with the d-cox effect sizes because they provide a more meaningful measure for explaining the magnitude of effects. Odds ratios express the odds of rearrest among comparison group offenders relative to the odds of rearrest among program participants and have a straightforward interpretation (Ferguson, 2009). For example, an odds ratio of 2.50 signifies that the odds of rearrest for offenders in the comparison group are 2.5 times the odds for offenders completing a family violence program.
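The two effect sizes described above can be illustrated with a minimal Python sketch. The function names and the cell counts are hypothetical and chosen only for illustration; they are not results from this study. The odds ratio is oriented as in the text (comparison group odds of rearrest relative to treatment group odds).

```python
import math

def odds_ratio(not_rearrested_e, rearrested_e, not_rearrested_c, rearrested_c):
    """Odds of rearrest in the comparison group relative to the treatment group.

    Arguments correspond to O1E, O2E, O1C, and O2C in the text.
    """
    return (not_rearrested_e * rearrested_c) / (rearrested_e * not_rearrested_c)

def d_cox(odds_ratio_value):
    """Cox transformation: natural log of the odds ratio divided by 1.65."""
    return math.log(odds_ratio_value) / 1.65

# Hypothetical counts (NOT study data): 1,000 offenders per group.
example_or = odds_ratio(not_rearrested_e=800, rearrested_e=200,
                        not_rearrested_c=600, rearrested_c=400)
example_d = d_cox(example_or)
print(f"odds ratio = {example_or:.2f}, d-cox = {example_d:.2f}")
```

An odds ratio of exactly 1 yields a d-cox of 0, which corresponds to no difference between the groups.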
Plan of Analysis
The next sections describe the procedures employed to conduct the research and present the results of the analyses. Specifically, we present three distinct results sections, one for each of the three programs (i.e., FVEP, EXPLORE, and EVOLVE). Each section is divided into five subsections. The first subsection describes the procedure employed to match the treatment and comparison samples. The second subsection presents a comparison of the matched samples on a series of covariates. The third subsection compares offenders who completed the programs to those who dropped out. The fourth subsection presents the program outcomes. The fifth subsection provides the effect sizes described above.
Court-Mandated Family Violence Interventions Central Connecticut State University
The Family Violence Education Program is a 9-week pretrial program that meets once per week for 1.5 hours. The purpose is to educate defendants (male or female) on how violence affects relationships and to provide them with basic interpersonal skills to have violence-free relationships.
Matching Process for FVEP
We began our analysis of the FVEP by identifying all family violence offenders from 2010 who were referred to the FVEP or who were eligible to attend the FVEP but did not. We first removed defendants who had their cases disposed during the follow-up period and were sentenced to prison, since they would have had no opportunity to be rearrested. This step resulted in a data file consisting of 5,030 offenders (3,981 FVEP participants and 1,049 offenders who were recommended to attend the FVEP by CSSD’s Family Services Division based on the DVSI-R assessment and/or were court-ordered to attend the FVEP by the presiding judge but did not attend). There are several reasons why offenders may be referred and court-ordered to the FVEP but not attend. For instance, a defense attorney may negotiate with the State’s Attorney (i.e., prosecutor) to not require program attendance in exchange for a plea bargain of reduced charges, reduced sentences, or some other type of treatment program such as counseling, anger management, or individualized therapy. In other cases, a pretrial defendant may fulfill other court-ordered conditions and the presiding judge will not require FVEP program attendance.
Unfortunately, these details of the pretrial process are not recorded in a systematic manner so we were unable to determine how often these situations occur.
The FVEP participant group and the no-FVEP comparison group were merged to create one large data set, hereafter referred to as the FVEP merged sample. This merged sample consisted of 5,030 cases: the 3,981 FVEP participants and the 1,049 no-FVEP cases. Data in each group were then checked for missing values. A total of 101 cases had missing court or DVSI-R information and were removed from the data set. After deleting these cases, the data set was reduced to 3,891 FVEP participants and 1,038 non-program comparison cases (total N = 4,929).
Following the sampling procedure detailed above, propensity score matching (PSM) was employed to minimize selection bias and ensure the subjects in the comparison group were similar to treated subjects on nine covariates (i.e., age at arrest, gender, racial/ethnic group membership, court, DVSI-R total score, DVSI-R risk level, DVSI-R risk to victim, DVSI-R dual arrest, number of prior arrests, and number of prior family violence arrests).
It is important to note that the number of individuals in the FVEP participants’ group (i.e., treatment group) was much larger than the number of individuals in the no-FVEP group (i.e., comparison group), with a ratio of approximately 3:1. In such situations, the matching procedure can be performed “with replacement, in which a single unit in the control group can be reused to be matched to more than one unit in the treatment group.” One advantage of this method is that it “reduces the overall imbalance between the two groups, because the closest possible unit in the control group can be used for matching, even if this unit also has been used for a different match” (Thoemmes, 2012, p. 8). A disadvantage, however, is that sometimes only a small number of units can be repeatedly matched to units in the treatment group. As a result, the estimate of effects of treatment will become reliant on these repeatedly used units.
An alternative matching method in this situation consists of an n:1 match, in which n units in one group are matched to one unit in the other group. This is what we did for the FVEP. The process began by aggregating the groups using a custom SPSS plug-in (see Thoemmes, 2012). The program used logistic regression as an estimation algorithm to calculate the propensity score for each of the subjects in the dataset using the nine covariates specified above. Then, using the nearest neighbor matching algorithm, the program matched three subjects in the treatment group (i.e., FVEP participants) to the one subject in the comparison pool (i.e., no-FVEP) whose propensity score most closely matched the treatment subjects’ propensity scores (i.e., 3:1 nearest neighbor matching), without replacement. We chose this method so that we could include as many FVEP participants as possible. A 1:1 match would have eliminated 2,842 FVEP participants (71% of all FVEP participants) and would have greatly limited the generalizability of the study.
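The n:1 nearest neighbor step described above can be sketched in a few lines of Python. This is an illustrative greedy version, not the SPSS plug-in's actual algorithm: it assumes propensity scores have already been estimated (for example, by logistic regression), and the function name, identifiers, and scores are all hypothetical.

```python
def match_n_to_one(treated_scores, comparison_scores, n=3):
    """Greedy n:1 nearest neighbor matching without replacement.

    treated_scores / comparison_scores: dict mapping subject id -> propensity score.
    Each comparison subject receives up to n treated subjects; each treated
    subject is used at most once. Returns (matches, unmatched_treated_ids).
    """
    available = dict(treated_scores)          # treated subjects not yet matched
    matches = {c_id: [] for c_id in comparison_scores}
    for _ in range(n):                        # one pass per match slot
        for c_id, c_score in comparison_scores.items():
            if not available:
                break
            # Closest remaining treated subject on the propensity score.
            nearest = min(available, key=lambda t_id: abs(available[t_id] - c_score))
            matches[c_id].append(nearest)
            del available[nearest]            # "without replacement"
    return matches, sorted(available)

# Hypothetical scores (NOT study data), matched 2:1 to keep the example small.
treated = {"t1": 0.10, "t2": 0.12, "t3": 0.50, "t4": 0.52, "t5": 0.90}
comparison = {"c1": 0.10, "c2": 0.50}
matched, unmatched = match_n_to_one(treated, comparison, n=2)
print(matched, unmatched)
```

Treated subjects left in the unmatched list play the same role as the FVEP participants who could not be matched and were dropped from the matched sample.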
During the propensity score matching procedure, cases in the FVEP participant group that were not matched to a no-program comparison group subject (n = 777) were removed from the data set. As a result, there were 1,038 no-program comparison subjects matched to 3,114 FVEP participants (see note 11). Following the matching procedure, the balance of all observed covariates as well as interactions among all covariates was examined. No covariates exhibited a large imbalance (|d| > .25) (see note 12). The relative multivariate imbalance measure L1 was larger in the unmatched sample (0.951) than in the matched sample (0.950). These measures indicate that the matching procedure successfully improved balance between groups. In addition, diagnostic plots were produced and show that covariate balance was greatly improved in the matched sample. A selection of plots is presented hereafter.
Figure 1 shows the distribution of propensity scores of FVEP participants (“treated”) and the no-FVEP comparison group subjects (“control”) before and after matching, with overlaid kernel density estimates.
Note 11: In the description of outcomes (see next section), a comparison between the 3,114 matched FVEP participants and the 777 FVEP participants who were not matched will be presented.
Note 12: The overall balance test can only be implemented for 1:1 matching without replacement.
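The covariate balance criterion used above (a standardized difference with absolute value greater than .25 counts as a large imbalance) can be illustrated with a short Python sketch. It assumes the pooled-standard-deviation form of the standardized mean difference, which may differ from the exact formula used by the matching software; the function names and ages below are hypothetical.

```python
import math
import statistics

def standardized_mean_difference(treated_values, comparison_values):
    """Difference in group means divided by the pooled standard deviation."""
    mean_diff = statistics.mean(treated_values) - statistics.mean(comparison_values)
    pooled_variance = (statistics.variance(treated_values)
                       + statistics.variance(comparison_values)) / 2
    return mean_diff / math.sqrt(pooled_variance)

def is_large_imbalance(d, threshold=0.25):
    """Flag a covariate whose absolute standardized difference exceeds the threshold."""
    return abs(d) > threshold

# Hypothetical ages at arrest (NOT study data).
ages_treated = [25, 31, 33, 38, 40]
ages_comparison = [26, 30, 34, 37, 41]
d_age = standardized_mean_difference(ages_treated, ages_comparison)
print(f"d = {d_age:.3f}, large imbalance: {is_large_imbalance(d_age)}")
```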
Figure 1. Distribution of Propensity Scores for the FVEP Study Groups
Figure 2 below shows the line plot of standardized differences before and after matching.
Figure 2. Line Plot of Standardized Matching Differences
Figure 3 below shows histograms with overlaid kernel density estimates of standardized differences before and after matching.
Figure 3. Histograms of Standardized Matching Differences
Comparison of FVEP Matched Study Groups
Following the matching procedures, it was necessary to determine differences between the two study groups for race/ethnicity, gender, DVSI-R scores, age, and criminal history.
Table 3 assesses study group differences by race/ethnicity and gender. Both study groups were made up of approximately 46% whites, 28% African-Americans, and 25% Hispanics. In addition, 71% of both study groups were males. The chi-square test shows there were no statistically significant differences between the FVEP comparison and the FVEP treatment groups for race/ethnicity or gender.
Table 4 presents the DVSI-R comparison between the two study groups. For this analysis, we looked at differences in the DVSI-R risk level categories (a low risk category indicates these offenders are at a low risk of reoffending) and the assessment item that identifies whether the offender poses an immediate risk to the victim. Again, the chi-square test shows there were no statistically significant DVSI-R differences between the two study groups.
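The chi-square comparisons reported for the matched groups can be illustrated with a minimal Python sketch for a 2x2 table (for example, study group by gender). The counts are hypothetical, chosen only to resemble a roughly 71% male split, and are not taken from the study tables. The sketch assumes a Pearson chi-square without continuity correction, for which the p-value at one degree of freedom can be obtained from the complementary error function.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    table: ((a, b), (c, d)), rows = study groups, columns = categories.
    Returns (chi2, p_value) with 1 degree of freedom, no continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts (NOT study data): (males, females) per study group.
gender_table = ((2211, 903),   # treatment group, n = 3,114
                (740, 298))    # comparison group, n = 1,038
chi2, p_value = chi_square_2x2(gender_table)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above the conventional .05 level would be read, as in the text, as no statistically significant difference between the two groups on that variable.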
The final test between the study groups assessed average differences between them for age at their 2010 family violence arrest, their DVSI-R total risk score (the higher the score, the riskier the offender), number of arrests prior to their 2010 family violence arrest, and number of family violence arrests prior to their 2010 family violence arrest. The average age for both study groups was 33 years old, their DVSI-R total risk score was 8.10, and both groups had an average of three prior arrests and fewer than one prior family violence arrest. The t-test analyses show there were no statistically significant differences between these two groups.
Differences Between FVEP Program Completers and Non-Completers
This section explores differences between family violence offenders in the FVEP participants’ study group who successfully completed the FVEP and those who did not complete it. Of the offenders referred to the FVEP, 84% (2,624) completed the program and 16% (490) did not. The completion rate was consistent with internal Judicial Branch-CSSD FVEP reports. Table 6 presents the race/ethnicity and gender of the completers and non-completers.
White offenders had higher FVEP completion rates (87%) than Hispanic (83%) and African-American offenders (80%). Race/ethnicity was statistically related to FVEP completion. There was no statistically significant relationship between gender and program completion, although a slightly higher percentage of males than females completed the FVEP (85% vs. 83%).