Two-Part Statistical Analysis of NIFM Occurrence and Mark Informativeness

Author

Maria Cuellar

Published

March 17, 2026

1 Purpose

This report starts the substantive statistical analysis for the Stoney NIJ project using the most recent grouped data currently present in this project directory. The main modeling idea is a two-part or hurdle-style analysis.

That framing is natural here because the scientifically relevant outcome has two stages:

  1. many tested rounds yield no observed NIFM at all
  2. among rounds with a NIFM, the mark can then vary in informativeness

A single model for raw minutiae counts would mix those two processes together. For this project, it is more interpretable to separate:

  1. NIFM occurrence: does a round yield any NIFM?
  2. mark informativeness: conditional on a NIFM being present, how strong is that mark?

This report uses grouped counts, so the results should be treated as category-level analyses. They are useful for identifying broad patterns, but they do not yet account for clustering within guns, donors, or loads. With round-level data, the same two-part logic could be implemented more formally as an individual-level hurdle model.

2 Data

The current analysis uses data/nifm-2-12-25.csv. Each row is a handgun type / caliber category. The Rounds Tested column gives the total number of rounds in that category. The remaining count columns give the number of observed NIFMs with each minutiae count.

Now load the grouped data and create the derived variables used throughout the rest of the analysis.

Code
options(stringsAsFactors = FALSE)

data_candidates <- c(
  "data/nifm-2-12-25.csv",
  "../data/nifm-2-12-25.csv"
)

data_path <- data_candidates[file.exists(data_candidates)][1]
if (is.na(data_path)) {
  stop("Could not find nifm-2-12-25.csv in data/ or ../data/")
}

dat <- read.csv(data_path, check.names = FALSE)
minutiae_values <- as.numeric(names(dat)[5:ncol(dat)])
count_matrix <- as.matrix(dat[5:ncol(dat)])

dat$group <- paste0(dat$Type, dat$Caliber)
dat$total_nifm <- rowSums(count_matrix)
dat$no_nifm <- dat[["Rounds Tested"]] - dat$total_nifm
dat$occurrence_rate <- dat$total_nifm / dat[["Rounds Tested"]]
dat$mean_minutiae <- rowSums(sweep(count_matrix, 2, minutiae_values, `*`)) / dat$total_nifm
dat$nifm_ge_6 <- rowSums(count_matrix[, minutiae_values >= 6, drop = FALSE])
dat$nifm_ge_8 <- rowSums(count_matrix[, minutiae_values >= 8, drop = FALSE])
dat$nifm_ge_10 <- rowSums(count_matrix[, minutiae_values >= 10, drop = FALSE])
dat$p_ge_6 <- dat$nifm_ge_6 / dat$total_nifm
dat$p_ge_8 <- dat$nifm_ge_8 / dat$total_nifm
dat$p_ge_10 <- dat$nifm_ge_10 / dat$total_nifm

summary_tab <- dat[, c(
  "group", "Type", "Caliber", "Rounds Tested", "Donors", "total_nifm",
  "occurrence_rate", "mean_minutiae", "p_ge_6", "p_ge_8", "p_ge_10"
)]
summary_tab$occurrence_rate <- round(summary_tab$occurrence_rate, 3)
summary_tab$mean_minutiae <- round(summary_tab$mean_minutiae, 2)
summary_tab$p_ge_6 <- round(summary_tab$p_ge_6, 3)
summary_tab$p_ge_8 <- round(summary_tab$p_ge_8, 3)
summary_tab$p_ge_10 <- round(summary_tab$p_ge_10, 3)

knitr::kable(summary_tab, caption = "Grouped summary statistics by handgun type and caliber.")
Grouped summary statistics by handgun type and caliber.
group Type Caliber Rounds Tested Donors total_nifm occurrence_rate mean_minutiae p_ge_6 p_ge_8 p_ge_10
A22 A 22 198 15 50 0.253 5.30 0.400 0.060 0.020
A32 A 32 105 12 60 0.571 5.63 0.417 0.133 0.067
A38 A 38 348 20 101 0.290 5.44 0.347 0.119 0.020
A45 A 45 178 16 50 0.281 5.16 0.260 0.100 0.000
R22 R 22 225 23 52 0.231 5.29 0.327 0.115 0.038
R38 R 38 107 20 79 0.738 5.85 0.342 0.203 0.114
R45 R 45 102 12 86 0.843 5.79 0.337 0.221 0.081

3 Why A Two-Part Model Makes Sense

At the round level, the outcome of interest is effectively zero-inflated in the ordinary-language sense: many rounds have zero observed NIFMs, while positive rounds have a right-skewed positive outcome.

For this dataset, a hurdle interpretation is more natural than a classic zero-inflated Poisson interpretation:

  1. the zero state is scientifically meaningful, not just a technical nuisance
  2. the positive state has its own scientific meaning because the quality of the observed NIFM matters

So the main analysis below is organized in two parts:

  1. Part 1 models whether any NIFM occurs
  2. Part 2 models whether the observed NIFM is more informative, using minutiae >= 8 as the primary threshold

4 Descriptive Patterns

The table above already shows two notable patterns.

  1. The rate of NIFM occurrence varies widely by category, from about 23% in R22 to about 84% in R45.
  2. The average minutiae count among observed NIFMs is more stable, ranging from about 5.16 to 5.85, but the share of higher-minutiae NIFMs appears larger in the revolver categories R38 and R45.

Now plot those two patterns directly so the separation between occurrence and informativeness is easy to see.

Code
old_par <- par(mfrow = c(1, 2), mar = c(6, 4, 3, 1))

barplot(
  height = dat$occurrence_rate,
  names.arg = dat$group,
  ylim = c(0, 1),
  col = "steelblue",
  ylab = "Observed NIFM rate",
  main = "NIFM occurrence by group"
)

barplot(
  height = dat$p_ge_8,
  names.arg = dat$group,
  ylim = c(0, max(dat$p_ge_8) * 1.15),
  col = "tan3",
  ylab = "P(minutiae >= 8 | NIFM)",
  main = "High-minutiae share by group"
)

Code
par(old_par)

5 Part 1: NIFM Occurrence

For the first part of the hurdle model, the natural data structure is grouped binomial: each row gives the number of NIFMs observed out of the total number of tested rounds in that category.

5.1 Global Heterogeneity Test

Now test whether the seven groups have the same overall NIFM occurrence rate.

Code
occurrence_table <- as.matrix(dat[, c("total_nifm", "no_nifm")])
rownames(occurrence_table) <- dat$group
colnames(occurrence_table) <- c("NIFM", "No NIFM")

occurrence_chisq <- chisq.test(occurrence_table)
occurrence_chisq

    Pearson's Chi-squared test

data:  occurrence_table
X-squared = 221.99, df = 6, p-value < 2.2e-16

This test strongly rejects equal occurrence rates across the seven groups.

5.2 Group-Specific Occurrence Estimates

The simplest category-level version of Part 1 is a grouped-binomial model with one parameter per handgun-type / caliber group. This is just a convenient way to estimate the observed occurrence rates on the log-odds scale.

Now estimate the group-specific occurrence levels on the log-odds scale.

Code
occurrence_fit_group <- glm(
  cbind(total_nifm, no_nifm) ~ 0 + group,
  data = dat,
  family = binomial()
)

occurrence_coef <- data.frame(
  group = dat$group,
  occurrence_rate = round(dat$occurrence_rate, 3),
  log_odds = round(coef(occurrence_fit_group), 3)
)

knitr::kable(
  occurrence_coef,
  caption = "Group-specific occurrence estimates for Part 1 of the two-part analysis."
)
Group-specific occurrence estimates for Part 1 of the two-part analysis.
group occurrence_rate log_odds
groupA22 A22 0.253 -1.085
groupA32 A32 0.571 0.288
groupA38 A38 0.290 -0.894
groupA45 A45 0.281 -0.940
groupR22 R22 0.231 -1.202
groupR38 R38 0.738 1.037
groupR45 R45 0.843 1.682

5.3 Exploratory Main-Effects Grouped-Binomial Model

Because there are only seven grouped rows, any regression decomposition is exploratory. As a first pass, I fit a main-effects model with handgun type and caliber:

Now fit a simpler explanatory model that separates handgun type from caliber.

Code
occurrence_fit <- glm(
  cbind(total_nifm, no_nifm) ~ Type + factor(Caliber),
  data = dat,
  family = quasibinomial()
)

summary(occurrence_fit)

Call:
glm(formula = cbind(total_nifm, no_nifm) ~ Type + factor(Caliber), 
    family = quasibinomial(), data = dat)

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)        -1.9756     0.9059  -2.181    0.161
TypeR               1.3396     0.8404   1.594    0.252
factor(Caliber)32   2.2633     1.4607   1.549    0.261
factor(Caliber)38   1.2208     0.9773   1.249    0.338
factor(Caliber)45   1.4371     1.0462   1.374    0.303

(Dispersion parameter for quasibinomial family taken to be 33.76481)

    Null deviance: 222.783  on 6  degrees of freedom
Residual deviance:  63.977  on 2  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4
Code
occurrence_fitted <- data.frame(
  group = dat$group,
  observed = round(dat$occurrence_rate, 3),
  fitted = round(fitted(occurrence_fit), 3)
)

knitr::kable(
  occurrence_fitted,
  caption = "Observed and fitted NIFM occurrence rates from the exploratory grouped-binomial model."
)
Observed and fitted NIFM occurrence rates from the exploratory grouped-binomial model.
group observed fitted
A22 0.253 0.122
A32 0.571 0.571
A38 0.290 0.320
A45 0.281 0.369
R22 0.231 0.346
R38 0.738 0.642
R45 0.843 0.690

The main conclusion from this model is not that any single coefficient is precisely estimated; with only seven grouped rows, coefficient-level inference is fragile. The useful takeaway is that the data clearly show between-group heterogeneity in Part 1 of the hurdle model.

6 Part 2: Mark Informativeness Among Observed NIFMs

For the second part of the hurdle model, the question is no longer whether a NIFM exists. Instead, conditional on observing a NIFM, the question is whether some groups tend to produce more informative marks.

The raw minutiae categories are sparse in the upper tail, so I use coarser bands:

  • 4_5
  • 6_7
  • 8plus

6.1 Three-Band Distribution Test

Now compare the full low-versus-mid-versus-high minutiae distribution across groups.

Code
band_table <- cbind(
  `4_5` = rowSums(count_matrix[, minutiae_values %in% c(4, 5), drop = FALSE]),
  `6_7` = rowSums(count_matrix[, minutiae_values %in% c(6, 7), drop = FALSE]),
  `8plus` = rowSums(count_matrix[, minutiae_values >= 8, drop = FALSE])
)
rownames(band_table) <- dat$group

band_chisq <- chisq.test(band_table, simulate.p.value = TRUE, B = 100000)
band_chisq

    Pearson's Chi-squared test with simulated p-value (based on 1e+05
    replicates)

data:  band_table
X-squared = 22.671, df = NA, p-value = 0.02899
Code
knitr::kable(
  cbind(group = rownames(band_table), as.data.frame(band_table)),
  caption = "Observed minutiae-count bands among NIFMs by group."
)
Observed minutiae-count bands among NIFMs by group.
group 4_5 6_7 8plus
A22 A22 30 17 3
A32 A32 35 17 8
A38 A38 66 23 12
A45 A45 37 8 5
R22 R22 35 11 6
R38 R38 52 11 16
R45 R45 57 10 19

This provides evidence that the minutiae distribution differs across groups, although the effect is more modest than the group differences in overall NIFM occurrence.

6.2 Focused High-Minutiae Analysis

Because David’s notes emphasize the practical value of more informative marks, I also examine the probability that an observed NIFM has at least 8 minutiae.

Now focus on the more practical threshold question: among observed NIFMs, how often does a mark reach at least 8 minutiae?

Code
high_table <- cbind(
  High = dat$nifm_ge_8,
  Lower = dat$total_nifm - dat$nifm_ge_8
)
rownames(high_table) <- dat$group

high_chisq <- chisq.test(high_table)
high_chisq

    Pearson's Chi-squared test

data:  high_table
X-squared = 10.87, df = 6, p-value = 0.09247
Code
high_fit_group <- glm(
  cbind(nifm_ge_8, total_nifm - nifm_ge_8) ~ 0 + group,
  data = dat,
  family = binomial()
)

high_coef <- data.frame(
  group = dat$group,
  observed_probability = round(dat$p_ge_8, 3),
  log_odds = round(coef(high_fit_group), 3)
)

knitr::kable(
  high_coef,
  caption = "Group-specific high-detail probabilities for Part 2 of the two-part analysis."
)
Group-specific high-detail probabilities for Part 2 of the two-part analysis.
group observed_probability log_odds
groupA22 A22 0.060 -2.752
groupA32 A32 0.133 -1.872
groupA38 A38 0.119 -2.004
groupA45 A45 0.100 -2.197
groupR22 R22 0.115 -2.037
groupR38 R38 0.203 -1.371
groupR45 R45 0.221 -1.260
Code
high_fit <- glm(
  cbind(nifm_ge_8, total_nifm - nifm_ge_8) ~ Type + factor(Caliber),
  data = dat,
  family = quasibinomial()
)

summary(high_fit)

Call:
glm(formula = cbind(nifm_ge_8, total_nifm - nifm_ge_8) ~ Type + 
    factor(Caliber), family = quasibinomial(), data = dat)

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)   
(Intercept)       -2.77098    0.12760 -21.717  0.00211 **
TypeR              0.74429    0.09455   7.872  0.01576 * 
factor(Caliber)32  0.89917    0.17553   5.123  0.03606 * 
factor(Caliber)38  0.70544    0.12968   5.440  0.03217 * 
factor(Caliber)45  0.71900    0.13282   5.414  0.03247 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasibinomial family taken to be 0.1007278)

    Null deviance: 11.01845  on 6  degrees of freedom
Residual deviance:  0.20439  on 2  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 3
Code
high_fitted <- data.frame(
  group = dat$group,
  total_nifm = dat$total_nifm,
  observed = round(dat$p_ge_8, 3),
  fitted = round(fitted(high_fit), 3)
)

knitr::kable(
  high_fitted,
  caption = "Observed and fitted probabilities that an observed NIFM has at least 8 minutiae."
)
Observed and fitted probabilities that an observed NIFM has at least 8 minutiae.
group total_nifm observed fitted
A22 50 0.060 0.059
A32 60 0.133 0.133
A38 101 0.119 0.112
A45 50 0.100 0.114
R22 52 0.115 0.116
R38 79 0.203 0.211
R45 86 0.221 0.213

The simple chi-square test is suggestive but not definitive at the 0.05 level, while the exploratory main-effects model points in the same direction: revolver categories appear to yield a larger share of high-minutiae NIFMs than automatic categories, especially in R38 and R45.

7 Two-Part Interpretation

Taken together, the two-part results suggest that category differences operate more strongly through NIFM occurrence than through minutiae quality alone.

More specifically:

  1. Part 1 is the strongest signal in the data: some categories are much more likely than others to produce any NIFM.
  2. Part 2 also shows group differences, but they are more modest: among observed NIFMs, some groups appear more likely to produce marks with at least 8 minutiae.
  3. This means a category can be scientifically favorable for two different reasons:
    • it yields NIFMs often
    • or, when it yields them, they tend to be more informative

That distinction is exactly why the two-part framework is better suited to this problem than a single omnibus count model.

8 Interpretation

At this stage, the strongest empirical result is:

  1. The categories differ substantially in how often NIFMs occur at all.

A secondary result is:

  1. The categories may also differ in the quality distribution of observed NIFMs, with some revolver categories producing a larger fraction of higher-minutiae marks.

By contrast:

  1. The average minutiae count alone does not appear to separate the groups as strongly as the two-part analysis does.

9 Limitations

These analyses are a good starting point, but there are important limits to what can be claimed from the current file alone.

  1. The data are aggregated by category, so there is no adjustment for within-gun, within-donor, or within-load dependence.
  2. The absent R32 category is a structural gap and is not modeled directly here.
  3. The grouped main-effects models are exploratory because there are only seven category-level rows.
  4. The current data do not yet incorporate ESLR or any direct measure of associative value.
  5. Because the data are aggregated, this is a grouped approximation to a hurdle model, not the full individual-round version.