## Software

### Diagnostic Test Se and Sp Estimation

- 2 independent tests, 2 populations, no gold standard
WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

Example from section 3.2.2.

Two independent tests, two populations

Estimate Se and Sp of microscopic examination and PCR.

Hui-Walter Model for N. salmonis in trout.

Data source: Enøe C, Georgiadis MP, Johnson WO. Estimation of sensitivity and specificity of diagnostic tests and disease prevalence when the true disease state is unknown. Preventive Veterinary Medicine 2000; 45(1-2):61-81. DOI: 10.1016/S0167-5877(00)00117-3.
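The `p1` and `p2` expressions in the model that follows all share one structure: each cell of the 2x2 table mixes an infected and a non-infected component, and within each component the two tests are assumed conditionally independent. A minimal Python sketch of that structure is given here for orientation. The function name `hui_walter_cells` is hypothetical, and the inputs are the posterior means for population 2 from the results table, plugged in purely for illustration (plugging in posterior means is not the same as computing posterior mean cell probabilities).

```python
def hui_walter_cells(prev, se1, sp1, se2, sp2):
    """Cell probabilities for two conditionally independent tests.

    Rows index test 1 (+, -); columns index test 2 (+, -).
    Hypothetical helper mirroring the p1[,] / p2[,] lines of the model.
    """
    return [
        [prev * se1 * se2 + (1 - prev) * (1 - sp1) * (1 - sp2),
         prev * se1 * (1 - se2) + (1 - prev) * (1 - sp1) * sp2],
        [prev * (1 - se1) * se2 + (1 - prev) * sp1 * (1 - sp2),
         prev * (1 - se1) * (1 - se2) + (1 - prev) * sp1 * sp2],
    ]


# Illustrative plug-in values: test 1 = microscopic examination, test 2 = PCR
p2 = hui_walter_cells(prev=0.8524, se1=0.1745, sp1=0.9914, se2=0.9301, sp2=0.963)

# The four cell probabilities must sum to 1
total = sum(sum(row) for row in p2)

# Expected counts for a population of size n2 = 30 under these values
n2 = 30
expected_counts = [[n2 * p for p in row] for row in p2]

for row in expected_counts:
    print([round(c, 2) for c in row])
```

Setting both sensitivities and both specificities to 1 collapses the table to its diagonal, which is a quick sanity check on the formulas.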

`model{`

y[1:Q, 1:Q] ~ dmulti(p1[1:Q, 1:Q], n1)

z[1:Q, 1:Q] ~ dmulti(p2[1:Q, 1:Q], n2)

p1[1,1] <- pi1*Seme*Sepcr + (1-pi1)*(1-Spme)*(1-Sppcr)

p1[1,2] <- pi1*Seme*(1-Sepcr) + (1-pi1)*(1-Spme)*Sppcr

p1[2,1] <- pi1*(1-Seme)*Sepcr + (1-pi1)*Spme*(1-Sppcr)

p1[2,2] <- pi1*(1-Seme)*(1-Sepcr) + (1-pi1)*Spme*Sppcr

p2[1,1] <- pi2*Seme*Sepcr + (1-pi2)*(1-Spme)*(1-Sppcr)

p2[1,2] <- pi2*Seme*(1-Sepcr) + (1-pi2)*(1-Spme)*Sppcr

p2[2,1] <- pi2*(1-Seme)*Sepcr + (1-pi2)*Spme*(1-Sppcr)

p2[2,2] <- pi2*(1-Seme)*(1-Sepcr) + (1-pi2)*Spme*Sppcr

Seme ~ dbeta(2.82, 2.49) ## Mode=0.55, 95% sure Seme < 0.85

Spme ~ dbeta(15.7, 1.30) ## Mode=0.98, 95% sure Spme > 0.80

pi2 ~ dbeta(1.73, 2.71) ## Mode=0.30, 95% sure pi2 > 0.08

Sepcr ~ dbeta(8.29, 1.81) ## Mode=0.90, 95% sure Sepcr > 0.60

Sppcr ~ dbeta(10.69, 2.71) ## Mode=0.85, 95% sure Sppcr > 0.60

Z ~ dbern(tau1)

pi1star ~ dbeta(1.27, 9.65) ## Mode=0.03, 95% sure pi1star < 0.30

pi1 <- Z*pi1star

}

list(n2=30, n1=132, z=structure(.Data=c(3,0,24,3),.Dim=c(2,2)),

y=structure(.Data=c(0,0,3,129),.Dim=c(2,2)), Q=2, tau1=0.95)

list(Z=1, pi1star=0.03, pi2=0.30, Seme=0.55, Spme=0.98, Sepcr=0.90, Sppcr=0.85)

| node | mean | sd | MC error | 2.50% | median | 97.50% | start | sample |
|---|---|---|---|---|---|---|---|---|
| Seme | 0.1745 | 0.0652 | 2.32E-04 | 0.06738 | 0.1678 | 0.3195 | 10001 | 100000 |
| Sepcr | 0.9301 | 0.04626 | 1.86E-04 | 0.8173 | 0.9391 | 0.9919 | 10001 | 100000 |
| Spme | 0.9914 | 0.007498 | 3.93E-05 | 0.9717 | 0.9935 | 0.9996 | 10001 | 100000 |
| Sppcr | 0.963 | 0.01671 | 6.88E-05 | 0.9247 | 0.9651 | 0.9893 | 10001 | 100000 |
| pi1 | 0.009058 | 0.01219 | 5.79E-05 | 0 | 0.003916 | 0.04176 | 10001 | 100000 |
| pi2 | 0.8524 | 0.06658 | 2.56E-04 | 0.7027 | 0.8596 | 0.9596 | 10001 | 100000 |

- 2 independent tests, 2 populations, no gold standard (TAGS) - frequentist approach
Dubey et al. 1995 (Dubey JP, Thulliez P, Weigel RM, Andrews CD, Lind P, Powell EC. Sensitivity and specificity of various serologic tests for detection of Toxoplasma gondii infection in naturally infected sows. American Journal of Veterinary Research 1995; 56(8):1030-1036) compared 5 serologic tests for the diagnosis of toxoplasmosis in 1000 naturally-exposed sows using bioassay methods as the gold standard (definitive test). Bioassays were done in mice (all sows) and cats (183 sows) using cardiac muscle from sampled sows. Samples in the study were collected in two batches: nos. 1-463 and 464-1000.

If Toxoplasma gondii were isolated from either mice or cats, the sow was considered infected. A sow was considered non-infected if the bioassay results were negative. To demonstrate use of TAGS, we will use results of the bioassay test and 2 of the 5 serologic tests: the modified agglutination test (MAT) and the enzyme-linked immunosorbent assay (ELISA). These 2 serologic tests were the most accurate of the tests evaluated and are commonly used in screening of pigs for toxoplasmosis. The MAT was considered positive if the titer was >= 20 and the ELISA was positive if the OD value was > 0.36. The sensitivity (Se) and specificity (Sp) of the MAT using bioassay as the gold standard were calculated to be 82.9% and 90.2%, respectively. This represents the traditional approach that assumes that the combined bioassay (cat + mouse) has Se = Sp = 1.

**Calculations**

The TAGS method requires a minimum of 2 populations with 2 conditionally independent tests. Use of the batch data (batch 1 = 463; batch 2 = 537) provides a logical way to create 2 populations of similar size in which the sensitivity and specificity of the tests should be equivalent. Cross-classified test results for the batches are in the following table:

| Batch | MAT+/Bioassay+ | MAT+/Bioassay- | MAT-/Bioassay+ | MAT-/Bioassay- |
|---|---|---|---|---|
| 1 | 37 | 55 | 7 | 364 |
| 2 | 104 | 26 | 22 | 385 |

What are the sensitivity and specificity of the MAT using “TAGS” when the MAT and bioassay are the tests under consideration?

Estimates for the MAT using TAGS are exactly the same as those from the traditional approach, i.e., Se = 82.9% and Sp = 90.2%. TAGS also estimated that the bioassay was perfectly sensitive and specific, again matching the traditional approach. Hence, this provides evidence in support of bioassay as a true gold standard.
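The traditional gold-standard figures quoted above can be reproduced directly from the batch table by pooling the two batches and treating the bioassay result as the true infection status. The short Python sketch below does only that arithmetic; it is not the TAGS maximum-likelihood procedure, and the variable names are illustrative.

```python
# Counts per batch: (MAT+/Bioassay+, MAT+/Bioassay-, MAT-/Bioassay+, MAT-/Bioassay-)
batches = {
    1: (37, 55, 7, 364),
    2: (104, 26, 22, 385),
}

# Pool the batches, treating bioassay as a perfect reference test
tp = sum(b[0] for b in batches.values())  # MAT+ among bioassay-positive sows
fp = sum(b[1] for b in batches.values())  # MAT+ among bioassay-negative sows
fn = sum(b[2] for b in batches.values())  # MAT- among bioassay-positive sows
tn = sum(b[3] for b in batches.values())  # MAT- among bioassay-negative sows

se = tp / (tp + fn)  # sensitivity = 141 / 170
sp = tn / (tn + fp)  # specificity = 749 / 830

print(f"Se = {se:.1%}, Sp = {sp:.1%}")  # Se = 82.9%, Sp = 90.2%
```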

Now let’s ignore the bioassay results and use the cross-classified data from MAT and ELISA by batch as follows (1 pig had a missing ELISA value):

| Batch | MAT+/ELISA+ | MAT+/ELISA- | MAT-/ELISA+ | MAT-/ELISA- |
|---|---|---|---|---|
| 1 | 67 | 25 | 41 | 329 |
| 2 | 97 | 33 | 36 | 371 |

What are the sensitivity and specificity of the MAT now using “TAGS” when the MAT and the ELISA are the tests under comparison?

MAT is estimated to have Se = 100% and Sp = 97.3%, both of which are clearly overestimated compared with the correct values obtained when bioassay is used as the comparison.

What is the most likely reason for the change in MAT estimates?

Results of the two tests (MAT and ELISA) are positively correlated (dependent), conditional on infection status (see Gardner IA, Stryhn H, Lind P, Collins MT. Conditional dependence between tests affects the diagnosis and surveillance of animal diseases. Preventive Veterinary Medicine 2000; 45(1-2):107-122. DOI: 10.1016/S0167-5877(00)00119-7). This positive dependence between the tests results in overestimation of MAT accuracy with TAGS.

The main conclusion is that it is inappropriate to use the TAGS software to compare 2 serologic tests without a gold standard because the tests are likely to be dependent. Other methods (and software) that account for this dependence (2 conditionally dependent tests, 2 populations) are necessary to obtain unbiased estimates.

- 2 independent tests, 2 populations, no gold standard - spreadsheet workbook
Excel workbook to estimate Se and Sp, and for frequentist sample size calculations.

- 2 dependent tests, 1 population, no gold standard
WinBUGS 1.4 code to accompany the paper entitled "Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Preventive Veterinary Medicine 2005; 68(2-4):145-163. DOI: 10.1016/j.prevetmed.2004.12.005".

Example from section 3.3.3.1.

Two dependent tests, one population.

Estimation of the Se and Sp of two FAT tests.

Classical swine fever virus.

Data source: Bouma A, Stegeman JA, Engel B, de Kluijver EP, Elbers AR, De Jong MC. Evaluation of diagnostic tests for the detection of classical swine fever in the field without a gold standard. Journal of Veterinary Diagnostic Investigation 2001; 13(5):383-388. DOI: 10.1177/104063870101300503.
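Each beta prior in the model that follows is annotated with the mode it was chosen to have. For a Beta(a, b) density with a > 1 and b > 1 the mode is (a - 1) / (a + b - 2), so the annotations can be checked with a few lines of Python. This is only a sanity check on the stated modes (the "95% sure" percentile statements are not verified here), and `beta_mode` is a hypothetical helper name.

```python
def beta_mode(a, b):
    """Mode of a Beta(a, b) density; only defined here for a > 1 and b > 1."""
    if a <= 1 or b <= 1:
        raise ValueError("mode formula requires a > 1 and b > 1")
    return (a - 1) / (a + b - 2)


# (a, b, mode stated in the model comments)
priors = {
    "pi": (13.322, 6.281, 0.70),
    "Sefat1, Sefat2": (9.628, 3.876, 0.75),
    "Spfat1, Spfat2": (15.034, 2.559, 0.90),
}

for name, (a, b, stated) in priors.items():
    print(f"{name}: computed mode = {beta_mode(a, b):.3f}, stated mode = {stated:.2f}")
```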

`model{`

x[1:4] ~ dmulti(p[1:4], n)

p[1] <- pi*(Sefat1*Sefat2+covDp) + (1-pi)*((1-Spfat1)*(1-Spfat2)+covDn)

p[2] <- pi*(Sefat1*(1-Sefat2)-covDp) + (1-pi)*((1-Spfat1)*Spfat2-covDn)

p[3] <- pi*((1-Sefat1)*Sefat2-covDp) + (1-pi)*(Spfat1*(1-Spfat2)-covDn)

p[4] <- pi*((1-Sefat1)*(1-Sefat2)+covDp) + (1-pi)*(Spfat1*Spfat2+covDn)

ls <- (Sefat1-1)*(1-Sefat2)

us <- min(Sefat1,Sefat2) - Sefat1*Sefat2

lc <- (Spfat1-1)*(1-Spfat2)

uc <- min(Spfat1,Spfat2) - Spfat1*Spfat2

pi ~ dbeta(13.322, 6.281) ### Mode=0.70, 95% sure > 0.50

Sefat1 ~ dbeta(9.628,3.876) ### Mode=0.75, 95% sure > 0.50

Spfat1 ~ dbeta(15.034, 2.559) ### Mode=0.90, 95% sure > 0.70

Sefat2 ~ dbeta(9.628, 3.876) ### Mode=0.75, 95% sure > 0.50

Spfat2 ~ dbeta(15.034, 2.559) ### Mode=0.90, 95% sure > 0.70

covDn ~ dunif(lc, uc)

covDp ~ dunif(ls, us)

rhoD <- covDp / sqrt(Sefat1*(1-Sefat1)*Sefat2*(1-Sefat2))

rhoDc <- covDn / sqrt(Spfat1*(1-Spfat1)*Spfat2*(1-Spfat2))

}

list(n=214, x=c(121,6,16,71))

list(pi=0.7, Sefat1=0.75, Spfat1=0.90, Sefat2=0.75, Spfat2=0.90)

| node | mean | sd | MC error | 2.50% | median | 97.50% | start | sample |
|---|---|---|---|---|---|---|---|---|
| Sefat1 | 0.7691 | 0.06151 | 9.16E-04 | 0.6497 | 0.7693 | 0.887 | 10001 | 100000 |
| Sefat2 | 0.8147 | 0.06045 | 9.83E-04 | 0.6959 | 0.8156 | 0.9281 | 10001 | 100000 |
| Spfat1 | 0.8851 | 0.0581 | 5.35E-04 | 0.7533 | 0.8924 | 0.975 | 10001 | 100000 |
| Spfat2 | 0.8485 | 0.07411 | 5.83E-04 | 0.6856 | 0.8566 | 0.9667 | 10001 | 100000 |
| rhoD | 0.6814 | 0.1484 | 0.001845 | 0.3135 | 0.7049 | 0.9071 | 10001 | 100000 |
| rhoDc | 0.3629 | 0.2655 | 0.002054 | -0.09339 | 0.361 | 0.858 | 10001 | 100000 |
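The dependence structure of the model above can be sketched in Python to show how the covariance terms enter the cell probabilities, how their admissible bounds (`ls`, `us`, `lc`, `uc`) arise, and how the covariances convert to the correlations `rhoD` and `rhoDc`. The function names are hypothetical, and the numeric inputs are the initial values from the inits list with an arbitrarily chosen covariance, used only for illustration.

```python
import math


def cov_bounds(a, b):
    """Admissible (lower, upper) covariance for two tests with accuracies a and b."""
    return (a - 1) * (1 - b), min(a, b) - a * b


def dependent_cells(prev, se1, sp1, se2, sp2, cov_dp, cov_dn):
    """Cell probabilities in the order (+,+), (+,-), (-,+), (-,-)."""
    return [
        prev * (se1 * se2 + cov_dp) + (1 - prev) * ((1 - sp1) * (1 - sp2) + cov_dn),
        prev * (se1 * (1 - se2) - cov_dp) + (1 - prev) * ((1 - sp1) * sp2 - cov_dn),
        prev * ((1 - se1) * se2 - cov_dp) + (1 - prev) * (sp1 * (1 - sp2) - cov_dn),
        prev * ((1 - se1) * (1 - se2) + cov_dp) + (1 - prev) * (sp1 * sp2 + cov_dn),
    ]


def correlation(cov, a, b):
    """Convert a covariance into a correlation, as done for rhoD and rhoDc."""
    return cov / math.sqrt(a * (1 - a) * b * (1 - b))


# Illustrative values: initial values from the inits list
prev, se1, sp1, se2, sp2 = 0.7, 0.75, 0.90, 0.75, 0.90

ls, us = cov_bounds(se1, se2)  # bounds for covDp
lc, uc = cov_bounds(sp1, sp2)  # bounds for covDn

# Arbitrary positive dependence among infected animals, none among non-infected
cells = dependent_cells(prev, se1, sp1, se2, sp2, cov_dp=0.5 * us, cov_dn=0.0)
independent = dependent_cells(prev, se1, sp1, se2, sp2, cov_dp=0.0, cov_dn=0.0)

print("covDp bounds:", ls, us)
print("cells with dependence:", [round(p, 4) for p in cells])
print("cells if independent: ", [round(p, 4) for p in independent])
print("rhoD at upper bound:", correlation(us, se1, se2))
```

With positive `covDp`, probability mass moves from the discordant cells to the concordant ones. This is the mechanism by which ignoring dependence makes two correlated tests appear more accurate than they are.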