Rick Scott, Governor
Florida Department of Corrections, Secretary Michael D. Crews

Florida Department of Corrections
Timothy H. Cannon, Interim Secretary

Data and Methods: Statistical Analysis

Studies typically report several factors that contribute to recidivism rates for many offender populations, including state prison inmates. These include such personal characteristics as age, gender, race, length and severity of criminal history, education and skill levels, and conduct in prison. Other sentencing factors such as length of incarceration and post-release supervision affect recidivism rates as well.

Definition of Factors that Influence Recidivism Rates

The factors reported here were selected using the following criteria:

  • strong evidence that the factor influences recidivism,
  • the factor is measured independently from specific programs or services,
  • valid data are readily available to measure the factor,
  • data used are expected to remain reliable over time.

These criteria are intended to ensure that future reports, analyses, and evaluations of Department programs and activities can depend on this report's data, sources, and methods.

Factor data are grouped into categories primarily to facilitate presentation of the basic relationship each has to recidivism rates. The categories are based on a combination of these criteria:

  • inmate sub-populations for which data are often requested,
  • inmate groupings typically reported in other Department documents,
  • equal distribution of the cases, where possible.

These criteria for categorizing the factors serve two goals: to show each factor's basic relationship, if any, to recidivism rates; and to provide the data needed to respond to the internal and external information requests the Department typically receives.

Gender is used to define the subpopulations analyzed here, yielding separate information about male and female inmate recidivism. Consequently, although male and female recidivism rates differ substantially, gender is not treated here as a factor in the technical sense. For more information about how gender influences recidivism rates, see Recidivism Rate Curves.

The age at release factor is calculated in the ordinary manner as the integer age on the day of release from prison. The race factor is categorized as black or non-black for analysis because 98.3% of inmates self-report as "white" or "black." The Hispanic factor is based on an inmate self-report as such on either race or ethnicity. Hispanics comprise 85.5% of all inmates not reporting "white" or "black" on race, and only 4.6% of those who report "white" or "black."
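
The integer age calculation described above can be sketched as follows. This is a minimal illustration only: the function name and the example dates are hypothetical, and the Department's own computation is not shown in this report.

```python
from datetime import date

def age_at_release(birth_date: date, release_date: date) -> int:
    """Whole years completed on the day of release (hypothetical helper)."""
    years = release_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in the release year.
    if (release_date.month, release_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

# Illustrative dates only; not taken from the release cohort.
print(age_at_release(date(1970, 6, 15), date(2000, 6, 14)))
```

An inmate released the day before a birthday is counted at the younger age, which is the usual meaning of integer age.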

The post-release supervision factor is defined as those inmates who appear on a Department supervision caseload after prison release without having committed a new offense. The months in prison factor is defined as the number of partial months between admission and release from prison, including temporary out-time (e.g., local jail time). The disciplinary reports factor is a count of the total number of reports each inmate received between admission and release. Only final disciplinary reports, meaning those unchallenged by inmates or upheld after being challenged, are included. The release custody factor is defined as high ("close" custody) or low ("minimum" or "community" custody) just prior to prison release. The community custody level is combined with minimum custody because it was implemented only in December 1999, late in the release cohort period. The education level factor is defined as the last grade equivalent score on the Tests of Adult Basic Education (TABE) prior to release. A TABE score is available whether or not an inmate participated in academic programs while in prison. Therefore, the measure is considered an indication of an inmate's academic skill level only, not necessarily a reflection of prison academic program performance.

Several criminal history factors are used. The prior recidivism factor is defined as the number of times before the current prison release that an inmate was released from prison and committed a new crime. This is the same reoffense recidivism measure used in this analysis counted back through time prior to the most recent imprisonment.

The most serious offense factor is defined as the highest category of offense ever committed in an inmate's criminal career from these five types, beginning with the most serious:

  • homicide
  • sexual or lewd behavior
  • robbery
  • other violent (assault, battery, stalking, etc.)
  • burglary.
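
The ranking above amounts to scanning an inmate's career offenses for the highest-ranked category present. The following is a minimal sketch of that logic; the category labels, the function name, and the handling of careers with none of the five types (returning None) are assumptions made for illustration.

```python
# Categories in descending order of seriousness, as listed above.
SERIOUSNESS_ORDER = [
    "homicide",
    "sexual or lewd behavior",
    "robbery",
    "other violent",
    "burglary",
]

def most_serious_offense(career_offenses):
    """Return the highest-ranked category found in the career, or None.

    Offenses outside the five ranked types (e.g. drug or property
    offenses) are ignored here; that handling is an assumption.
    """
    present = set(career_offenses)
    for category in SERIOUSNESS_ORDER:
        if category in present:
            return category
    return None

# Illustrative career; not actual inmate data.
print(most_serious_offense(["burglary", "drug", "robbery"]))
```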

The factors for property offenses, drug offenses, and weapons offenses are counts of the total number of each type of offense in each inmate's career. Property offenses include fraud and theft or damage of property, but do not include burglary. Drug offenses include trafficking, manufacture, dealing, distribution, and possession of illegal drugs and illegal activity with prescription drugs. Weapons offenses include illegal sale, use, and possession of weapons, but not use of a weapon to commit another crime. For example, an armed robbery is counted as a robbery rather than as a weapons offense.

In the statistical analyses, each of the following factors was included as a dichotomous (two-category) variable, with one category indicating the characteristic's presence and the other indicating its absence:

  • black
  • Hispanic
  • supervision
  • high custody
  • low custody
  • homicide as most serious offense
  • sex/lewdness as most serious offense
  • robbery as most serious offense
  • other violent offense as most serious offense
  • burglary as most serious offense.
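
As an illustration of the presence/absence coding, the five most-serious-offense variables can be produced from a single category value. This sketch assumes 1 means present and 0 means absent, and that an inmate with none of the five offense types receives 0 on all five; the labels and function name are hypothetical.

```python
OFFENSE_CATEGORIES = ["homicide", "sex/lewdness", "robbery", "other violent", "burglary"]

def offense_indicators(most_serious):
    """Map a most-serious-offense category to five 0/1 indicator variables."""
    return {
        f"{category} as most serious offense": int(most_serious == category)
        for category in OFFENSE_CATEGORIES
    }

# Illustrative case: at most one of the five indicators is set to 1.
print(offense_indicators("robbery"))
```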

The remaining factors below were included in analyses as continuous measures, using the full range of variable values:

  • age at release
  • months in prison
  • disciplinary reports
  • education grade level
  • prior recidivism events
  • property offenses (females)
  • drug offenses
  • weapons offenses.

For more information about how these factors appear in male and female inmates, see Release Cohort. For more information on these factors, see the Technical Appendix.

Analyses of Factors' Influence on Recidivism Rates

To assess the combined effect of the factors on recidivism, a standard cross-validation method was employed to test the predictive value of the factors. The male and female cohorts were each separated into six subsets, one each for reoffense or reimprisonment at three follow-up periods: 18, 36, and 60 months after release. Each subset was randomly divided in half. One half of each subset was used to develop a predictive statistical model of recidivism. The other half was used to validate the model. Using all appropriate factors as covariates, a logistic regression model was estimated to predict failure (recidivism) for each gender, recidivism measure, and follow-up period. In these models the recidivism measure was defined as dichotomous (either failure or success) within the specified follow-up period, and only releases having the complete follow-up period were included. The model information (regression coefficients) from each prediction half was applied to factor values in the validation half of each subset. The resulting combined effects were used to classify each case as a recidivist or non-recidivist.
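
The split-half procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the Department's implementation: the data are synthetic, the model is fitted by simple gradient ascent rather than a statistical package, and the 0.5 classification cutoff is assumed because the report does not state one. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_prob(w, xi):
    # w[0] is the intercept; w[1:] are the factor coefficients.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit a logistic regression by batch gradient ascent on the log-likelihood."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_prob(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def split_half_accuracy(X, y, seed=0, cutoff=0.5):
    """Fit on a random half, then classify the other half with those coefficients."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    dev, val = idx[:half], idx[half:]
    w = fit_logistic([X[i] for i in dev], [y[i] for i in dev])
    correct = sum(int(predict_prob(w, X[i]) >= cutoff) == y[i] for i in val)
    return correct / len(val)

# Synthetic demonstration data: one factor, 1 = recidivist, 0 = non-recidivist.
rng = random.Random(1)
X, y = [], []
for _ in range(400):
    x = rng.gauss(0, 1)
    X.append([x])
    y.append(1 if x + rng.gauss(0, 0.5) > 0 else 0)

accuracy = split_half_accuracy(X, y)
print(f"validation accuracy: {accuracy:.3f}")
```

Keeping the fitting half and the validation half separate means the reported accuracy reflects how well the coefficients generalize to cases the model has not seen.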

The percentage of cases expected to be classified correctly by chance is known from the percentages of recidivists in each prediction and validation half of each subset. The percentage classified correctly by the model, using the factors, is compared to the percentage expected to be correct by chance. The model's improvement over chance expectation in correctly classifying cases indicates the value of the combined factors for predicting recidivism. For information on the predictive power of these combined factors, see Factors Affecting Recidivism Rates.
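
The report does not give the formula for chance expectation. One common reading, assumed here, is the proportional chance criterion: if cases are assigned to classes at the prediction half's recidivism rate, the expected share correct in the validation half follows from the two rates. The function names and example rates below are hypothetical.

```python
def chance_accuracy(p_prediction, p_validation):
    """Expected share classified correctly by chance (assumed formula).

    p_prediction: proportion of recidivists in the prediction half
    p_validation: proportion of recidivists in the validation half
    """
    return p_prediction * p_validation + (1 - p_prediction) * (1 - p_validation)

def improvement_over_chance(model_accuracy, p_prediction, p_validation):
    """Percentage-point gain of the model over chance expectation."""
    return model_accuracy - chance_accuracy(p_prediction, p_validation)

# Illustrative rates only; not results from this report.
print(chance_accuracy(0.3, 0.3))
print(improvement_over_chance(0.70, 0.3, 0.3))
```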

To measure the relative influence of each factor compared to that of other factors, a proportional hazard regression model was estimated for each recidivism measure (reoffense and reimprisonment) for males and females separately. These models were evaluated using a stepwise selection method in which variables are entered into the model one at a time, beginning with the factor most strongly related to the recidivism measure. At each step, as the next most related factor is added, any previously entered factor may be removed if its relationship to recidivism is no longer significant given the influence of the other factors in the model. Any factor dropped from the model may re-enter at a later step. As a result, the order in which factors are entered and retained reflects the strength of their influence on recidivism relative to other factors. The statistical technique, applied to the data, determines whether and when factors are entered, retained, or dropped; the analyst's judgment does not. Each factor was ranked in influence based on the final order in which it entered and survived in the model. Factors that did not enter or survive in the model were ranked by the strength of their small residual relationships to recidivism. The ranking of these non-significant factors is not reliable.

To measure the general effect of each factor on recidivism, a second proportional hazard regression model was estimated for each recidivism measure for males and females separately. Unlike the models used to assess relative influence, these models included all factors as covariates. For indicator variables, relative risks with lower and upper bounds under 95% confidence limits are reported based on hazard ratios from the regression model. For continuous variables, average relative effects over the entire 60-month follow-up period are reported for each factor based on the model regression coefficients. Each factor's hazard ratio is defined as the exponentiated regression coefficient; so conversely, the natural log of the hazard ratio is the regression coefficient. For each continuous variable, upper and lower bounds (with 95% confidence) around the coefficient were estimated from the upper and lower bounds for the hazard ratio by taking the natural log of each.
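
The relationship between regression coefficients and hazard ratios stated above is a simple exponential and logarithm pair. The sketch below shows the conversion in both directions; the coefficient and bound values are made up for illustration and are not results from this report.

```python
import math

def hazard_ratio(coefficient):
    """Hazard ratio is the exponentiated regression coefficient."""
    return math.exp(coefficient)

def coefficient_bounds(hr_lower, hr_upper):
    """Recover coefficient bounds by taking the natural log of the hazard ratio bounds."""
    return math.log(hr_lower), math.log(hr_upper)

# Hypothetical values for a continuous factor such as age at release.
coef = -0.03
hr = hazard_ratio(coef)
lower, upper = coefficient_bounds(0.96, 0.98)
print(f"hazard ratio: {hr:.4f}")
print(f"coefficient bounds: {lower:.4f} to {upper:.4f}")
```

Because the logarithm is monotonic, the lower hazard ratio bound always maps to the lower coefficient bound.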

To illustrate the general effect each factor has on recidivism rates, survival functions for categories on each factor were estimated from the proportional hazard regression model. The complement of the survival estimates, 1 - S(t), is used to graph the cumulative failure rates (recidivism rate curves). Categories used for continuous variables (e.g., age) were selected for convenience to facilitate charting the factor effects. These rates are adjusted for other factors by estimating the rates with the values of other covariates in the model set at the average value for the baseline group. These rates reflect the partial effect of each factor on recidivism after controlling for other factors' influences. They are reported to convey visually the net relationship between each factor and recidivism rates.
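
As a sketch of how an adjusted failure curve follows from a proportional hazard model, the standard relationship S(t | x) = S0(t) ** exp(b · (x - r)) can be applied to a baseline survival curve S0, where r holds the reference covariate values at which S0 is estimated. The function name and all numbers below are hypothetical, and the report's exact estimation procedure is not shown.

```python
import math

def cumulative_failure(baseline_survival, coefs, covariates, reference):
    """Cumulative failure curve 1 - S(t | x) under proportional hazards.

    baseline_survival: S0(t) at successive follow-up times, evaluated at `reference`
    coefs: regression coefficients, one per covariate
    covariates: covariate values for the category being charted
    reference: covariate values at which the baseline curve is defined
    """
    linear = sum(b * (x - r) for b, x, r in zip(coefs, covariates, reference))
    return [1.0 - s ** math.exp(linear) for s in baseline_survival]

# Illustrative baseline curve and a single covariate whose hazard ratio is 2.
baseline = [1.0, 0.8, 0.6]
print(cumulative_failure(baseline, [math.log(2.0)], [1.0], [0.0]))
```

Varying one covariate while holding the others at their reference values is what makes the resulting curve a partial effect.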

For information on the relative importance, the general effects of factors, and charts illustrating the partial effects of each individual factor while controlling for all others, see Factors Affecting Recidivism Rates.

To report actual recidivism rates broken down by factors, the recidivism rates within bounds at 95% confidence are estimated for categories of each factor. Unlike the general net effects of individual factors displayed in the recidivism rate curves, these actual rates were estimated using a basic Kaplan-Meier survival analysis model. These actual rates do not adjust for the effects of other factors on recidivism. Instead, they simply report the expected recidivism rates of inmates in the particular category on each factor. For further explanation and to view these actual recidivism rates, see Tables of Actual Rates by Factor - Unadjusted.
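
A basic Kaplan-Meier estimate of the kind described above can be sketched as follows. The report does not state how the 95% bounds were computed; Greenwood's formula with a normal approximation is assumed here. The function names and the small data set are hypothetical.

```python
import math

def kaplan_meier(months, failed):
    """Return (time, survival, standard_error) at each distinct failure time.

    months: follow-up time for each release
    failed: 1 if the release ended in recidivism at that time, 0 if censored
    """
    data = sorted(zip(months, failed))
    at_risk = len(data)
    survival, greenwood = 1.0, 0.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = leaving = 0
        # Group all cases sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            if at_risk > events:
                greenwood += events / (at_risk * (at_risk - events))
            steps.append((t, survival, survival * math.sqrt(greenwood)))
        at_risk -= leaving
    return steps

def recidivism_rate_with_bounds(survival, std_error, z=1.96):
    """Recidivism rate 1 - S(t) with approximate 95% bounds, clipped to [0, 1]."""
    rate = 1.0 - survival
    return max(0.0, rate - z * std_error), rate, min(1.0, rate + z * std_error)

# Illustrative data: five releases, two censored.
for t, s, se in kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0]):
    print(t, recidivism_rate_with_bounds(s, se))
```

Because each category is estimated on its own, these rates carry no adjustment for other factors, which matches the unadjusted rates described above.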