It was in the 6th article that the buggy series began probing the numerous flaws and mishaps that ripple and swell throughout Pape's statistical work.
The buggy analysis there moved at a fast, fairly top-skimming approach, exactly as prof bug observed it would. Today's article is different. It slows down the pace and intensifies the scrutiny of Pape's statistical ship-wreck, which reflects, as it sinks bow-first into some dark deep miasma, the teeming blunders, glitches, and fallacies that swarm throughout Dying to Win and wear away its theoretical understructure and guidance-systems. Those flaws aren't confined to theoretical generalizations and statistical work. They also haunt almost all of Pape's major data-sets, starting with the very first table on p. 15 and on through subsequent chapters, including the one concocted for use in logit modeling in Chapter 6.
And what does this ship-wreck leave in its tow? Well, whether intentionally or not, this clear impression: both Pape's theory and the statistical work add up to a whitewashing of the dominant role of Islamist extremism in the rash of suicide terrorism that has erupted on the world scene since 1980.
One Final Point
1. Today's buggy argument begins with Part Four. That's purposeful, a clear indicator that it's a direct follow-up of the previous article in this series on Robert Pape. It sets out in detail what were presumably Pape's logit-models, including the final . . . well, final "two" fitted models that p. 99 and fn. 43 on p. 294 refer briefly to.
2. Part Five probes the various technical problems that beset Pape's data-set of 58 cases.
The previous buggy article, recall, examined the biases, glitches, and havoc-making howlers that riddled the data-set in substantive ways --- above all, its total exclusion of a good 10 Islamist terrorist groups that carried out suicidal attacks between 1980 and 2003's end. Not only did this exclusion further conceal the overwhelmingly dominant role of Islamist groups in those suicidal attacks --- a total of 17 of the 19 such terrorist groups --- but it allowed Pape to focus his logit modeling on only military occupations of nationalist minorities or alien populations by democratic countries. In this sloppy lopsided manner, without ever owning up to it, Pape solved much of his regression-work by definition . . . not that the statistical results that finally emerged didn't founder anyway . . . or so we will see in greater detail today.
The argument in Part Five switches focus and probes the technical troubles of Pape's wholly self-created data-set: not the ways he selectively included only cases that fitted his theory, or coded, classified, and organized the 58 cases he claims to have found, but rather whether its small size is even suitable for logistic regression . . . at any rate with standard maximum likelihood estimations, the standard way all software packages calculate the coefficients of variables in logistic regression. Several indicators converge to show that Pape's sample size is simply inadequate and will produce completely unreliable "effects" for the coefficients and equally unreliable outcomes for the binary dependent variable (the frequency rates at which suicide terrorism was observed to occur or not occur in the total 58 cases).
The inadequate size of Pape's self-made sample set need not have ruled out fairly accurate estimates of coefficient values, provided he had recognized his troubles with its size and resorted to what are called "exact methods" of logistic estimation. There's a 99% likelihood that he didn't. Pape's knowledge of logistic regression and logit analysis, you see, seems confined to basics and a strictly cookbook approach to statistical work --- unless, of course, he has been deliberately conning us with disastrous logit results on p. 99 and in fn. 43 on p. 294; and most likely he never realized that his sample set was too small for proper maximum likelihood estimation and that he needed to fall back on "exact methods" for small samples.
3. Part Six --- which will appear in the next buggy article (the 8th in this series) --- will delve deeply then into more of the technical howlers that ripple through Pape's statistical work, focusing especially on his astonishingly naive and totally misleading interpretations of what his reported logit modeling amounted to.
Remember What Was Said in the Previous Buggy Article
As the warning sounded in its introductory part noted, we're reduced to some speculation in the analysis of Pape's logit modeling here.
No help for it. Pape's astonishingly stingy reports of his logit modeling --- how many models were specified, how they tested against one another, what the estimated coefficients of his variables were in each, what tests of their statistical significance he actually used, and what, if any, goodness-of-fit test for overall model performance he used --- leave us no choice. Fortunately, if we assume certain things and infer some other things from his briefly reported models on p. 99 --- presumably his 2nd and 4th specified logit models --- we can be relatively confident about most of what follows.
In any case, something certain will emerge by the end of today's buggy argument: in a word, what a disastrous hash of a wrong, badly misconceived theory Pape's logit modeling turns out to be . . . a statistical fiasco from start to finish.
The fiasco starts with a lopsidedly wrong data-set that Pape uses as a sample selection --- equal to its population --- for his logistic regressions. The disaster continues with erroneous interpretations of his reported logit models' success as "predictive models"; it very likely envelops his interaction term, a key independent variable in at least the two models he reports on briefly; it infests his likely estimated coefficients quite apart from the misinterpreted interaction term --- a near-certainty, as we'll see, with the existence of what's called a "zero-cell defect" in his reported classification results (these play havoc with logit estimations); and it shoots up and multiplies in . . . well, no need to run ahead of our argument here. Parts Two and Three will be the place to delve into these troubles.
First Things First Though
Not surprisingly, in Pape's analysis, those world-views and strategic calculations turn out to have little or nothing to do with Islamist fundamentalism --- itself, Pape assures us at the start of chapter 7, peaceful and mainly concerned to fight off western cultural and economic imperialism --- and almost everything to do with nationalist resistance to such neo-colonialism . . . the latter, please note, a term that Pape seems himself to subscribe to. At any rate, he doesn't contradict it, and he even insists that what counts anyway is how bin Laden and his terrorist associates see the world, not how average Americans or others do.
On Pape's view, then, bin Laden and al Qaeda are animated overwhelmingly by motives and strategic calculations that are fully in line with Pape's nationalist theory of suicide terrorism. Presumably, too, so are its loose affiliates world-wide as well as its imitator terrorists in the Muslim world.
Not that Pape himself uses the term "Muslim world." It's a no-no for him --- and for reasons you're thoroughly familiar with by now.
Why Pape Has To Use Logit Modeling
He has to --- or some other, if less popular non-linear regression technique --- because his outcome or dependent variable --- whether suicide terrorism occurs or not --- is a binary qualitative variable that is inherently non-linear. The alternative outcomes --- if suicide terrorism occurs (Y = 1) or not (Y = 0) --- can't be regressed with linear equations for reasons that should be familiar to all of you.
As for Pape's independent variables on which the dependent variable is regressed --- once the non-linear equations are transformed by the logit model into linear ones using logged odds to the base e --- they are some combination of nationalist rebellion and religious differences between a democratic military occupier and the occupied people.
So far, so good. But wait!
Look back at Pape's diagram of his theory's causal influences and pathways. What has happened to "occupation" as a causal influence in his logit models? In particular, why doesn't it figure as an independent estimating variable (or covariate or predictor) in the models whose equations will be set out in a moment?
Well, as we noted in the preliminary remarks earlier today, we're dealing with another slippery technique of Pape's. Note that in the diagram, the role of "occupation" isn't confined there to democratic military occupiers; it refers in general to any occupation. By definition, however, the data-set that Pape has assembled, coded, and classified to run his logit models on is limited to occupations by democratic countries. The result? It's two-fold: 1) Instead of hundreds of occupations by democratic and non-democratic governments between 1980 and 2003 of territorially based ethnic minorities that resented central government control, Pape by definition limits the disputes to 58 cases, all involving democratic countries where, it turns out, 9 suicide terrorist groups emerged. And 2) Pape's limited data-set then compounds the problems by ignoring at least 9 or 10 Islamist-inspired suicide terrorist groups that carried out attacks against Muslim autocratic countries.
And Now Our Buggy Conjectures About Pape's Logit Modeling
Despite the Surprisingly Spare Reporting, We Will Assume Pape Followed Proper
With these big provisos in mind, we can assume that he had specified 4 or 5 models. Strictly speaking, given the need to test each logit model for its overall model performance as he added or experimented with different independent variables --- or estimators or predictors (in logistic regression terminology) or covariates (ditto) --- he would have likely opted for 5 models, starting with a baseline one in which any independent variables are set to 0 and he looked at the performance of the model using only the constant or intercept term and ran it on his data-set of 58 cases.
Even so, we'll list the base model with the constant or intercept term only and label it 1.1 and the theory-inspired first model as 1.2. As you can see immediately below, the intercept-only model is estimated with the covariates or independent variables set to zero. (Technically speaking, the two other independent variables that Pape's logit models use should be included and set to zero too.)
Pape's First Two Logit Models
Y = ln[p/(1-p)] = a + b0X + b0Z (1.1)
Y = ln[p/(1-p)] = a + b1X + b2Z (1.2)
* ln[p/(1-p)] = the logit = the logged odds that suicide terrorism (its value equal to "1") will occur if someone were to pick a random case from Pape's sample of 58 cases and observe whether in fact it occurs or not.
*a = the constant or intercept, the value Y takes when the X and Z independent variables all equal zero
*b1X = nationalist rebellion
*b2Z = religious differences between the occupier and the occupied people
As you've probably already inferred, the two independent variables --- nationalist rebellion and religious differences --- are qualitative binary variables too, called "categorical" variables . . . or estimators or covariates or "predictors" in logistic jargon.
That means that they too are treated like dummy variables, exactly like the dependent or outcome variable --- suicide terrorism is observed to either occur or not occur when Pape applies his logit models to his self-created data-set. There's nothing wrong with dummy variables used as independent variables. They appear all the time in linear regression too. It's the dependent binary qualitative variable that requires the use of non-linear regression like logistic.
The use of the independent dummy estimators, though, does require --- as you'll see momentarily --- a larger sample size than otherwise, even if Pape didn't realize this himself. (This is especially true when the distribution of the categorical variables in a logit model leans heavily to one value --- which as we'll see is true for Pape's reported 2nd model. In fact, there are only 9 occurrences of suicide terrorism in his relatively small sample size --- equal to the population of "democratic" military occupations, 58 total cases --- and a classified output of the sort that appears on p. 99 is lopsided toward its non-occurrence. Worse, there's a zero-cell in the reported classification table, which plays havoc as we'll also see with all the estimators' coefficients.)
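To see why a zero cell plays such havoc, here's a minimal sketch in Python. The 2x2 splits below are strictly hypothetical (Pape reports no such cross-tabulation for a single predictor); only the totals, 9 occurrences in 58 cases, come from the discussion above. The sketch leans on one known fact: with a single binary predictor, the maximum-likelihood logit coefficient is simply the log of the table's odds ratio, so an empty cell sends the estimate off to infinity.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Log odds ratio of a 2x2 table.

    a = predictor present, outcome occurs    b = predictor present, no outcome
    c = predictor absent,  outcome occurs    d = predictor absent,  no outcome
    """
    numerator, denominator = a * d, b * c
    if numerator == 0 and denominator == 0:
        return math.nan           # nothing can be estimated at all
    if denominator == 0:
        return math.inf           # zero cell: the estimate runs off to +infinity
    if numerator == 0:
        return -math.inf          # zero cell on the other diagonal
    return math.log(numerator / denominator)


# Hypothetical splits of 58 cases with 9 occurrences (NOT Pape's actual counts).
no_zero_cell = (6, 23, 3, 26)
with_zero_cell = (9, 20, 0, 29)

print("no zero cell:  ", log_odds_ratio(*no_zero_cell))
print("with zero cell:", log_odds_ratio(*with_zero_cell))
```

In practice a software package won't print "infinity"; it will typically report a huge coefficient with an even huger standard error, which is the havoc referred to above.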
Note that all the subsequent independent variables --- one more for the next two logit models Pape constructed --- are also categorical and hence have to be treated as dummy variables too.
Note, too, that once you've grasped the logit identity's role as it appears in the two specified models above --- ln[p/(1-p)] --- we'll take it for granted that you know logit analysis has transformed probabilities into odds and logged them to the base e . . . the natural logarithm, so we'll just drop it from the next equations. What that logit transformation does is remove both the floor and ceiling of probability estimates --- 0, 1 --- and in effect stretch out the S curve of the original non-linear cdf to look linear.
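For readers who want to see the transformation at work, here's a small sketch in Python. The function names are prof bug's own inventions for illustration; the only figure taken from the argument is the base rate of 9 occurrences in 58 cases.

```python
import math


def logit(p):
    """Turn a probability (strictly between 0 and 1) into logged odds."""
    return math.log(p / (1.0 - p))


def inv_logit(y):
    """Turn logged odds back into a probability."""
    return 1.0 / (1.0 + math.exp(-y))


# Probabilities are trapped between 0 and 1; their logits are unbounded.
for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(f"p = {p:<5} logit = {logit(p):+.3f}")

# The observed rate of suicide terrorism in the 58 cases, as logged odds.
# In an intercept-only model, this is what the constant term a estimates.
base_rate = 9 / 58
print("logit of 9/58:", logit(base_rate))
print("and back again:", inv_logit(logit(base_rate)))
```

Notice that a probability of 0.5 maps to a logit of exactly 0, and that probabilities near 0 or 1 map to large negative or positive values with no floor or ceiling.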
Pape's Reported 2nd Logit Model
Pape's astonishingly bare-boned report on this 2nd model appears on p. 99 and in fn. 43 on p. 294 in Dying to Win. In particular, the reported "predictive success" he claims on p. 99, as we'll see soon, appears in a 2x2 classification table that, typically, is misinterpreted by Pape on several counts: the classified results perform miserably; it isn't a prediction model in any dictionary sense of the term; it isn't even a prediction model for those logistic-regression theorists who take "predictive efficiency" seriously (like Scott Menard); and it is, rather, not even a classification table but what's called a "selection table" . . . well, we'll get to the tangled confusion soon enough in Part Five. Note here that if Pape didn't distinguish between these three kinds of tables, his logistic software --- whichever program he ran --- will not have properly tested the observed vs. predicted outcomes for statistical significance. He would have needed to apply the proper statistic and calculate the significance himself. (See Scott Menard, Applied Logistic Regression (Sage University Paper, 2nd ed., 2001), pp. 28-33.)
Here, meanwhile, is the second model, with a 3rd estimator added to 1b's two: an interaction term for nationalist rebellion and religious difference working in tandem on the behavior of the outcome variable --- suicide terrorism's occurrence or not:
Y = a + b1X + b2Z + b3XZ (2.0)
b3XZ = the interaction term just mentioned
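Here's a quick sketch in Python of what model 2.0 implies for the four possible combinations of its two dummy variables. The coefficient values are pure inventions for illustration, since Pape reports none; the point is only the arithmetic of how b3 enters.

```python
# Hypothetical coefficient values (Pape reports none of these).
A = -3.0    # constant or intercept
B1 = 1.0    # nationalist rebellion (X)
B2 = 0.5    # religious difference (Z)
B3 = 1.5    # interaction of the two (XZ)


def linear_predictor(x, z):
    """Logged odds of suicide terrorism under model 2.0 for dummies x and z."""
    return A + B1 * x + B2 * z + B3 * x * z


for x in (0, 1):
    for z in (0, 1):
        print(f"X = {x}, Z = {z}: logit = {linear_predictor(x, z):+.1f}")

# The "effect" of X on the logit is no longer a single number:
effect_of_x_without_z = linear_predictor(1, 0) - linear_predictor(0, 0)  # b1
effect_of_x_with_z = linear_predictor(1, 1) - linear_predictor(0, 1)     # b1 + b3
print("effect of X when Z = 0:", effect_of_x_without_z)
print("effect of X when Z = 1:", effect_of_x_with_z)
```

With the interaction term in the model, b1 by itself describes the influence of nationalist rebellion only in cases with no religious difference; where a religious difference exists, the influence on the logit is b1 + b3.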
Pape's Barely Referred to 4th Logit Model:
Y = a + b1X + b2Z + b3XZ + b4C (4.0)
b4C = concession made by democratic occupying countries to armed nationalist rebels (including terrorist groups) that, according to Pape, stopped them from resorting to suicide terrorism.
As you'll see, Pape invokes this 4th independent variable on p. 99 totally out of the blue with head-spinning abruptness: it doesn't appear at all in the diagrammed causal-pathways that appear on p. 96 of Dying to Win, and it is mentioned in passing for half a paragraph, followed by the itty-bitty fonts in the weird table on p. 100. After which, it isn't mentioned again for 139 pages in the rest of his book --- on p. 239, the text itself ending on p. 250.
As we noted earlier and will clarify in Part Six, its sudden invocation looks like a desperate, last-second rescue-job by Pape to save his reported 2nd logit model's results from disastrous mediocrity. (There's also a 3rd logit model Pape performed, mentioned for a sentence or two in fn 43 on p. 294: it substituted "linguistic difference" for religious difference, but Pape reported that the logit model couldn't be estimated properly because there wasn't enough variation across the 58 cases of military occupation for logistic regression's estimating procedure --- maximum likelihood estimation --- to be carried out and successfully distinguish the effects of the model's independent variables on the behavior of the outcome variable, suicide terrorism's occurrence or not.)
As for the 3rd logit model of Pape's, don't worry. Prof bug hasn't forgotten about it. It uses a different estimator in place of "religious difference" --- specifically, "linguistic difference" --- and as we'll see soon enough, Pape rejects it as not producing enough variation across the data-set to generate a statistically sound model.
A Few Clarifying Comments On These Three or Four Logit Models Seem In Order
The stress here is largely on clarification, not criticism. There will be plenty of time for criticism in Part Six.
(i) How To Assess Overall Model Performance
There are different tests to use for assessing each logit model's overall performance, starting with the improvement (if any) over Pape's baseline intercept only model, 1a.
Most likely, he would have decided to apply a "log-likelihood ratio test" or G (or Gm), which is reported automatically in every statistical software package for logistic regression when two different models are compared for model performance . . . at any rate when the models are nested. Meaning?
Take two logit models. The 1st is nested in the 2nd if the latter contains all the first model's independent variables and adds 1 or more new ones. The earlier model is then the "restricted" or "reduced" model whose overall performance is being compared with the more elaborate or "expanded" model. Take Pape's logit models. His baseline or null model --- the one with only the intercept or constant term (1.1) --- is nested in the more expanded logit model (1.2) with its two estimating variables or predictors or independent variables. Similarly, 1.2 is nested as a restricted model in Pape's 2nd model (2.0) with the interaction term:
Y = a + b0X + b0Z (1.1)
Y = a + b1X + b2Z (1.2)
Y = a + b1X + b2Z + b3XZ (2.0)
Nested models have the advantage that if you compare the expanded model with, initially, only 1 new independent variable added to the restricted model, you automatically are also testing for the statistical significance of that independent variable's estimated coefficient or parameter value . . . which means that you want to see if that coefficient value can be distinguished from random chance and disturbing conditions at the 0.05 level or better. Another name for the log-likelihood ratio is model chi-square, by the way, and it's based on -2LL (deviance) . . . with deviance the term used for the distribution of the error term in logistic regression the same way that "unexplained variance" is in linear regression for the error term.
The Easiest Way To Test For Model Performance
The easiest way --- it saves calculating time --- would be for Pape to compute the log-likelihood test for the baseline null model --- 1a, with only the constant term added --- and for his reported final researcher's model, which happens to be the 2nd one with the three categorical or qualitative variables added to it.
He would then subtract the 2nd model's -2LL result or value from the computed -2LL result for the null or baseline model. (The lower the -2LL value, the better is the model's performance.) As with any hypothesis testing for statistical significance, Pape would then take this reported difference between his 2nd researcher's model and the baseline model, allow for what's called degrees of freedom that reflect the model's number of variables, and compare the difference-value with the "critical values" in a chi-square table at the 0.05 level (or 0.01). If the difference-value between the two logit models is higher than the critical value at that level, then Pape could conclude that the 2nd model's "effects" --- measured by the calculated coefficients of the independent variables --- on the behavior of the outcome variable (suicide terrorism occurring or not) aren't due to random chance or unforeseen disturbances but actually describe their influences or "effects" on Y.
The null hypothesis that would be rejected in this case is that the independent variables make no difference in accounting for Y or the occurrence or not of suicide terrorism. And keep in mind that Pape's independent variables are all categorical or qualitative in nature, treated as dummy variables: for instance, religious difference occurs (= 1) or doesn't occur (=0) when Pape ran his 2nd model on his data-set of 58 cases. That means the log-likelihood test for it and the other categorical variables --- three in all for the 2nd model --- would be performed for both alternatives, 1 and 0.
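The procedure just described can be sketched in a few lines of Python. Two things here are solid: the intercept-only model's -2LL can be computed directly from the 9 occurrences in 58 cases, and the 0.05 critical values are the standard chi-square table entries. Everything else is assumption: the 2nd model's -2LL of 40.0 is made up (Pape reports nothing of the kind), and the 3 degrees of freedom assume one coefficient for each of the three binary estimators.

```python
import math

# Standard chi-square critical values at the 0.05 level, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def null_deviance(events, n):
    """-2LL of an intercept-only logit model with `events` occurrences in n cases."""
    p = events / n
    log_likelihood = events * math.log(p) + (n - events) * math.log(1.0 - p)
    return -2.0 * log_likelihood


def likelihood_ratio_test(dev_restricted, dev_expanded, df):
    """Return the model chi-square (G) and whether it beats the 0.05 critical value."""
    g = dev_restricted - dev_expanded
    return g, g > CRITICAL_05[df]


dev_null = null_deviance(events=9, n=58)   # grounded in the 58-case data-set
dev_model = 40.0                           # HYPOTHETICAL -2LL for the 2nd model

g, significant = likelihood_ratio_test(dev_null, dev_model, df=3)
print(f"-2LL, intercept only: {dev_null:.3f}")
print(f"-2LL, 2nd model (hypothetical): {dev_model:.3f}")
print(f"model chi-square G: {g:.3f}, significant at 0.05: {significant}")
```

Note how little room there is to work with: the baseline -2LL for 9 occurrences in 58 cases is only about 50, so any improvement the estimators bring has to be carved out of that.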
Another, More Important Way of Testing Model Performance
SPSS, prof bug's software (version 12 for Windows), reports that another test is the most reliable for model performance --- Hosmer and Lemeshow's Goodness of Fit Test, which (as the title of it shows) is sometimes distinguished from strict tests for model performance. Whatever, for our purposes the two kinds of tests do the same thing: they take the final fitted model --- which has been tested, say, with a -2LL statistic compared to nested models or the baseline model --- and assume simultaneously that this fitted logit model contains all the relevant estimating variables, entered in the proper functional form, with each of these independent estimators and the constant term themselves tested for statistical significance. What Hosmer and Lemeshow's Goodness of Fit Test does is then determine how effectively that model actually describes the behavior of the output variable --- the number of times or frequency with which, in Pape's model, suicide terrorism occurs or not when his data-set of 58 cases is run, one at a time, on the estimators.
No need to go into the details of the Hosmer-Lemeshow test itself. Simply note that Pape provides absolutely no information about its results on either p. 99 or in fn. 43 on p. 294.
(ii) Testing the Statistical Significance of the Individual Logistic Coefficients
The log-likelihood test for model performance has, to boot, an added advantage: it automatically tests for the statistical significance of each new independent variable added by Pape to the next 4 logit models above and beyond the baseline intercept-only model. Log-likelihood ratio tests, for those interested, are calculated in a couple of different ways.
An alternative way to test the statistical significance of each independent variable in logistic regression --- one at a time --- is the Wald statistic . . . the ratio of the non-standardized logit coefficient value of an independent variable to its standard error (squared, when it's reported as a chi-square value). It's calculated by all logistic software, but there are problems with its accuracy when the coefficient value ("effect" in logistic jargon) is large: the standard error gets inflated, and the resulting Wald chi-square value comes out too small. For this reason, most logistic theorists argue for the superior accuracy of the -2LL ratio test, even though it's still a good idea for a researcher to apply the Wald statistic to the individual coefficients (parameter values) of the estimators or predictors in a logit model and see how it compares with a -2LL ratio test.
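A minimal sketch of the Wald calculation in Python follows. The coefficient and standard-error values are hypothetical, chosen only to show the arithmetic and the large-coefficient trouble just mentioned; none of them come from Pape.

```python
import math


def wald_test(coefficient, std_error):
    """Wald z, Wald chi-square (z squared, 1 df), and two-sided p-value."""
    z = coefficient / std_error
    chi_square = z * z
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, chi_square, p_value


# Hypothetical, well-behaved estimate: modest coefficient, modest standard error.
z, chi_square, p_value = wald_test(1.2, 0.5)
print(f"z = {z:.2f}, Wald chi-square = {chi_square:.2f}, p = {p_value:.4f}")

# Hypothetical trouble case: a very large coefficient whose standard error
# has ballooned, so the Wald chi-square looks unimpressive.
big_z, big_chi_square, big_p = wald_test(12.0, 30.0)
print(f"z = {big_z:.2f}, Wald chi-square = {big_chi_square:.2f}, p = {big_p:.4f}")
```

In the second case the "effect" is ten times larger, yet the Wald statistic waves it away as insignificant. That's the situation in which a -2LL ratio test on the same variable is the safer guide.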
(iii) What About Pape's 4th Logit Model?
As you've probably noted, Pape's 2nd model was described a moment ago as his final "researcher" model. It's the one he reports on the most on p. 99 in a 2x2 table there and in fn. 43 on p. 294. What about his 4th logit model, however --- which contains a fourth variable . . . the last-second salvation-estimator that reflects the role of "concessions" in thwarting the resort, or so Pape alleges, of nationalist rebels to suicide terrorism in certain cases?
Well, as we said in the prologue today, it really doesn't look like anything other than a spur-of-the-moment addition when the galley-proofs were already in Pape's hands and a deadline hung over his head, that he hastily concocted because someone --- maybe one of his 20 expert readers or one of his 16 research assistants --- informed him just how mediocre the results of his 2nd logit model happened to be . . . at any rate as he interpreted them on p. 99. Enough said about the 4th model for the time being. It will be analyzed in greater depth later on in Part Six.
The 3rd original logit model, we assume, is the one that can't be treated for nesting purposes: it substitutes "linguistic differences" for "religious differences" as a comparison of performance, and Pape would have to treat that 3rd model and his researcher's 2nd model --- or 4th model if it turned out not to be just a spur-of-the-moment after-thought --- as two different models. That said, he would still apply something like a -2LL ratio-test to see which performed better. As it is, we'll see that Pape found that the 3rd logit model with linguistic differences substituting for religious differences between occupier and occupied people could not actually be estimated properly. The reason: not enough "variance" across the 58 cases for a logistic regression to estimate properly.
One clarification: with or without nesting, any comparisons of Pape's logit models would have to take into account that his independent variables are all categorical or qualitative in nature and are treated as binary dummy variables. When religious difference, for instance, shows up as existing in, say, the 18th case of Pape's 58 cases in total, then it is recorded as "1" or a positive. By contrast, the 19th case of military occupation might not show any religious difference and the logit estimation of that dummy variable's coefficient or parameter value would record a "0".
The existence of independent categorical variables also means that the degrees of freedom for statistical testing have to take into account the binary dummy values of those variables.
(iv) A Concrete Example of Model Performance
Here's a concrete example, using the following hypothetical reported analysis of model fitting (or performance) in SPSS logistic software for Pape's first two models, 1a with only the intercept and 1b with the two independent estimating variables --- or covariates or estimators or "predictors" --- as they influence the behavior of the outcome variable, Y: the frequency with which suicide terrorism will occur or not, given these two initial models.
Model Performance Information
How To Test For Model Fitting or Performance Illustrated with -2LL
|Model ||-2 Log Likelihood ||Chi-Square ||df ||Sig. |
|Intercept Only 1a ||284.429 ||- ||- ||- |
|Pape 1b Model ||265.972 ||18.457 ||6 ||.005 |
To determine whether there's a statistically valid relationship in the first model (1b) between the dependent or outcome variable (suicide terrorism) and the combination of independent variables that Pape specifies for 1b and the other logit models, SPSS --- or other software packages --- will automatically use the model chi-square (or -2 log-likelihood) test and produce it in a table entitled Model Fitting Information. The presence of a relationship between the dependent variable and combination of independent variables is based on the statistical significance of the model chi-square or -2 log-likelihood test in the SPSS table titled "Model Fitting Information". Another term, which prof bug and others prefer, is Model Performance.
In this strictly hypothetical analysis (Pape providing no information whatever here), the model chi-square turns out to be 18.457, a figure obtained by subtracting the -2LL result of Pape's model 1b from the intercept-only model's -2LL result; the difference is chi-sq. distributed. The lower a model's -2LL result, the better the model's independent variables' values (coefficients) perform. (Some applied logistic researchers never learn why -2 appears in the formula. Multiplying the log-likelihood by minus two creates a value whose distribution is recognized, making it suitable for the hypothesis testing that will be discussed in a moment or two.)
You then take the difference between 1a's and 1b's chi-sq. results and look up in a chi-sq table for "critical values" to see whether that difference --- 18.457 --- is statistically significant at, say the 0.05 level or the 0.01 level. If the difference --- 18.457, adjusted for degrees of freedom (df) --- exceeds the table's critical values at either level, the modeler can assume that there's only a 5% likelihood or less that the estimated coefficients values (results or effects in logistic regression) are due to random chance or disturbances.
In this hypothetical case, the resulting level of significance is 0.005 --- which is, as you can see, considerably better than a level of significance at 0.05. The resulting level of significance is also called the p-value.
It indicates there's less than 1.0% probability --- precisely 0.5% probability --- that the difference in model performance between Pape's baseline model (1a, with only an intercept) and model 1b with its two independent variables is due to random chance and unforeseen disturbing conditions. (The two independent variables in 1b, remember, are religious differences and nationalist rebellions as influences on the occurrence or not of suicide terrorism.)
More precisely, statistical modelers like Pape would use the chi-square result --- an improvement of 18.457, statistically significant at the 0.005 level --- to reject the null hypothesis that there's no difference between model 1a's and model 1b's overall performance.
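For anyone who would rather check the table's arithmetic than take it on faith, here's a small Python sketch using only the hypothetical figures reported above (284.429, 265.972, and 6 degrees of freedom). Instead of a table look-up, it uses the closed-form upper-tail probability of the chi-square distribution, which holds for even degrees of freedom only; that restriction is a simplification for this sketch, not a feature of the test itself.

```python
import math


def chi_square_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only works for positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1, built up term by term.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


neg2ll_intercept_only = 284.429   # model 1a in the hypothetical table
neg2ll_model_1b = 265.972         # model 1b in the hypothetical table

model_chi_square = neg2ll_intercept_only - neg2ll_model_1b
p_value = chi_square_upper_tail_even_df(model_chi_square, df=6)

print(f"model chi-square: {model_chi_square:.3f}")
print(f"p-value: {p_value:.4f}")
```

The subtraction reproduces the table's 18.457, and the resulting p-value rounds to the .005 shown in the Sig. column.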
Generally, out of habit, the cut-off for distinguishing between a statistically sound outcome and one that could be due to random chance is a level of 0.05 or 5%. Why use a null hypothesis this way? Because, in statistical work, something can't be positively proved to be a cause of an outcome: that would require showing an outcome is 100% certain to occur whenever a cause or combination of causes is known to exist, something no statistical test can do. At most you can say with a certain degree of probability short of 100% --- in this case 99.5% --- that one or more of Pape's two independent variables has influenced the occurrence of suicide terrorism.
Degrees of Freedom
One other comment about the hypothetical report in the table above. The term "df" refers to degrees of freedom --- which means that when someone like Pape is checking on the critical value of his logit models' individual performance in a chi-square table, he has to take into account the number of variables in each model. Observe in passing here that the calculation of df is more complex than in linear regression models, though most logistic software should (prof bug assumes) do this automatically for the researcher.
In particular, you not only have to consider any categorical variables at different levels --- binary, tripartite etc --- but with ordinal variables ("tall" vs. "average" vs "short" in height: 1, 2, and 3) the calculation involves a special formula.
Other Tests and References
We've mentioned the -2LL ratio test and the Hosmer-Lemeshow goodness-of-fit test for overall model performance. We also mentioned in passing the Wald statistic that can, with care, be applied to the individual coefficients to test for statistical significance.
Additionally, some logistic theorists have specified algorithms that approximate what R-square --- the coefficient of determination --- does for linear regression, but most specialists have doubts that they work as well for logistic regression. (Among the reasons why, a practical one happens to be --- as Hosmer and Lemeshow note in their outstanding book, Applied Logistic Regression (Wiley, 2nd ed., 2000), p. 167 --- that low R-square values tend to be the norm in logistic regression. For scholars and others who are accustomed to linear regression, these low values suggest something wrong with the tested model's performance when it might be sound in dealing with non-linear outcomes like Pape's binary dependent variable --- suicide terrorism occurs or doesn't occur.)
Those who are interested in various tests will find this online article a good if encyclopedia-like listing of them and other logistic regression concerns.
(v) Interaction Terms Clarified
An interaction term of the sort Pape uses for logit models 2.0 and 4.0 is not just perfectly acceptable, but desirable in any kind of regression modeling, whether linear or non-linear. What, then, does an interaction term for an independent variable mean?
Essentially this: interaction exists in a regression model whenever the influence of an independent variable on the behavior or value of the dependent variable --- in Pape's case, how frequently suicide terrorism will be observed to occur when his logit models are run on his data-set --- hinges on the magnitude of another independent variable. (The technical term for the influence of an independent variable on the dependent variable is its "effect" --- represented in logistic regression by the coefficient value expressed in a log odds ratio or odds ratio or in probability terms. The odds ratio, as the previous buggy article showed, is easier to make sense of than a log odds ratio, and probability estimates are complex because the partial derivative or slope change for the S-curved cumulative distribution function will likely vary anywhere on the curve.)
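A minimal sketch of that last point, assuming a made-up coefficient of 0.8 (it is not one of Pape's): the odds ratio implied by a logit coefficient is the same everywhere, while the change in probability per unit of the predictor depends on where you stand on the S-curve.

```python
import math


def logistic(z):
    """Cumulative logistic function: maps log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


B = 0.8  # hypothetical coefficient, expressed as a log odds ratio


def slope_at(z, b=B):
    """Slope of the probability curve at log odds z: b * p * (1 - p)."""
    p = logistic(z)
    return b * p * (1.0 - p)


print("odds ratio (constant):", round(math.exp(B), 3))
for z in (-3.0, 0.0, 3.0):
    print("log odds", z, "probability", round(logistic(z), 3),
          "slope", round(slope_at(z), 4))
```

The slope is largest in the middle of the curve, where the probability is 0.5, and shrinks toward both tails, which is why a single "effect in probability terms" cannot be read straight off the coefficient.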
Suppose, to take an example from an article by Edward Norton and two colleagues that will be cited shortly, we ask whether the effect [influence] of car weight on gas mileage --- the dependent variable of interest here --- is the same for both domestic and foreign cars.
The way to find out is to "run a regression to 'predict' gas mileage [the dependent or outcome variable] as a function of car weight, a dummy variable for foreign car [coded 1 for a foreign car, 0 for a domestic car] and an interaction between the two." And so? "If the coefficient on the interaction term is statistically significant, then there is a difference between domestic and foreign cars in how additional weight affects mileage." (Note that this is not itself a non-linear equation as stated, and the regression model could be specified as linear. It could be represented as a non-linear regression if a cut-off mileage were used as the dependent variable: say, 25 miles per gallon. In that case, higher than 25 mpg could be coded as 1, and automatically 0 would equal less than 25 mpg.)
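The car example can be sketched with a few lines of arithmetic. In a linear model containing weight, the foreign dummy, and their product, the coefficient on the product term equals the difference between the weight slopes fitted separately to the two groups, so the sketch below simply computes those two slopes. The mileage figures are synthetic and chosen to be perfectly linear; no significance test is attempted.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x for one group."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic data (NOT from any real study): weight in 1,000 lb, mileage in mpg.
domestic_weight = [2.0, 2.5, 3.0, 3.5, 4.0]
domestic_mpg = [30.0, 27.0, 24.0, 21.0, 18.0]
foreign_weight = [2.0, 2.5, 3.0, 3.5]
foreign_mpg = [34.0, 30.0, 26.0, 22.0]

slope_domestic = ols_slope(domestic_weight, domestic_mpg)
slope_foreign = ols_slope(foreign_weight, foreign_mpg)

# Coefficient on weight x foreign in the fully interacted linear model.
interaction_coefficient = slope_foreign - slope_domestic

print("slope, domestic cars:", slope_domestic)
print("slope, foreign cars: ", slope_foreign)
print("interaction term:    ", interaction_coefficient)
```

A non-zero interaction coefficient says that an extra 1,000 lb costs foreign cars a different amount of mileage than it costs domestic cars, which is the question the quoted passage poses.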
A More Illuminating Example
Here's an illustration of another example, which is taken from the best overall book on logistic regression ever written: David W. Hosmer and Stanley Lemeshow, Applied Logistic Regression (Wiley, 2nd ed., 2000), p. 71, which you'll recall we mentioned a few moments ago. [Oriented toward medicine and the other health sciences, the book is a marvel of both lucid exposition and mathematical analysis, with concrete examples followed throughout its 352 pages; both Hosmer and Lemeshow have two goodness-of-fit or model-performance tests named after them that are widely used these days. In many ways, the use of medical examples for logit modeling is an especially engrossing part of the book.]
Interaction Variable's Effect Illustrated
The two authors distinguish a confounder from an interaction variable by how each performs on a statistical test: clinically or theoretically viewed, a confounder may not pass a statistical test for significance --- its interaction with another independent variable makes sense clinically or theoretically, as does its effect or influence on the behavior of the dependent variable (suicide terrorism in Pape's case), but unlike a clear interaction variable it doesn't meet technical statistical standards for being distinguished from random chance at, say, a 0.05 level.
By contrast, an interaction effect does pass such a test and also makes clinical or theoretical sense. As for a confounder, Hosmer and Lemeshow argue that it's up to the researcher to decide whether the confounder --- despite the statistical drawback --- should still stay in a model based on his or her clinical understanding. [A variant, of course, would be to see if the confounder would pass a test for statistical significance at the 0.07 or 0.10 or even the 0.15 level. Why, beyond habit and convention, 0.05 is chosen as the maximum cut-off rate for hypothesis testing isn't clear at all. It all depends on the statistical and practical case in hand.]
(vi) Back to the Diagram
The example in that diagram plots three logit regression functions --- which, remember, thanks to the logit identity transformation, are linear in nature using logged odds. The outcome variable in the models discussed in this long chapter by Hosmer and Lemeshow is the presence or absence of CHD, coronary heart disease, in a large sample selection of the public. The key risk-factor --- a term used a lot in logistic regressions in medicine --- is sex, set out as a dummy variable for men (male = 1; 0 automatically = women); that means men are shown to be more at risk for CHD at any age beyond 30 than women are.
The independent variable represented in the diagram, the only other one in the 1st model besides sex, is age. (See the reported effects for the coefficients and the statistical tests for model performance below.) Line l1 corresponds to the logit regression function for women as a function of age. Line l2 represents the logit function or linear line for males. (The y axis plots the log odds of the risk of incurring CHD for men as opposed to women.)
Notice that if you look only at the first two logit models' regression lines, men have a higher incidence of incurring CHD after age 30 (the cut-off rate at the origin of X and Y axes) than women, and the same magnitude of difference in the risk of getting CHD exists right past age 70. That means that the "relationship between age and CHD is the same for males and females. In this situation," Hosmer and Lemeshow go on to observe, "there is no interaction, and the log odds ratios for sex (male vs. female), controlling for age, is given by the difference between line l2 [and line l1]." The difference remains the same between the sexes for all ages. Men, though, are still more at risk at every age level past 30 years.
Suppose, now, that instead the logit for males is represented by line l3. The slope of the third line is different, noticeably steeper than the 2nd line, and that would indicate that "the relationship between age and CHD among males is different from that among females. When this occurs, we say there is an interaction between age and sex." As they note further, the estimate of the log-odds ratios for sex --- males vs. females --- "controlling for age is still given by the vertical difference between the lines, l3 [and l1], but this difference now depends on the age at which the comparison is made."
In short, "age is an effect modifier" on the dependent variable's behavior --- the presence or absence of CHD in this example.
As a result, a good logit model would have to be able to deal with the interaction effect of age and sex after age 30. An interaction term would be developed, then, for a new logit model that contained not just a constant term, SEX, and AGE, but also a new independent variable of SEXxAGE working in tandem, and then tested for statistical significance. Needless to add, the three logit models that Hosmer and Lemeshow then run on the sample selection have to take that finding into account.
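To make the "effect modifier" idea concrete, here is a minimal sketch assuming a logit of the form constant + b_sex * SEX + b_age * AGE + b_int * SEX * AGE. The coefficient values are invented for illustration and are not the ones Hosmer and Lemeshow report; the point is only that the odds ratio for sex becomes exp(b_sex + b_int * age) once the interaction term is in the model.

```python
import math

# Hypothetical coefficients, NOT taken from Hosmer and Lemeshow's table.
B_SEX = 0.5       # main effect of SEX (male = 1)
B_SEX_AGE = 0.02  # coefficient on the SEX x AGE interaction term


def sex_odds_ratio(age, b_sex=B_SEX, b_int=B_SEX_AGE):
    """Odds ratio, males vs. females, evaluated at a given age."""
    return math.exp(b_sex + b_int * age)


for age in (40, 55, 70):
    print("age", age, "odds ratio for sex:", round(sex_odds_ratio(age), 2))

# With no interaction (b_int = 0) the odds ratio is the same at every age,
# which is the parallel-lines case described above.
print("no interaction:", round(sex_odds_ratio(40, b_int=0.0), 2),
      round(sex_odds_ratio(70, b_int=0.0), 2))
```

With a positive interaction coefficient the gap between the male and female lines widens with age, so any single "odds ratio for sex" has to be quoted together with the age at which it was computed.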
The Resulting Report Would Look Like This
Three Logit Models With the Third Reflecting Interaction
"The results in Table 3.13 show evidence of both confounding and interaction due to age. Comparing model I to model 2 we see that the coefficient for sex changes from 2.386 to 1.274, an 81 percent decrease. When the age by sex interaction is added to the model we see that the change in the deviance is 8.034 with a p-value of 0.005. Since the change in the deviance is significant, we prefer model 3 to model 2,and should regard age as both a confounder and an effect modifier. The net result is that any estimate of the odds ratio for sex should be made with reference to a specific age."
By now, prof bug hopes, you have a pretty good working idea of what interaction terms are and how they can be used in logit modeling. About the only other theoretical point worth knowing for our purposes is that "risk-factor" as the main variable being modified by another variable --- in the Hosmer-Lemeshow example, sex being modified by age and both being treated as a separate interaction variable --- is a term generally restricted to the health sciences. In fact it comes from epidemiology. James Jaccard, who has helped illuminate the use of interaction terms in both linear and logistic regression, prefers the use of the term "focal variable" in place of risk-factor. [James Jaccard, Interaction Effects in Logistic Regression (Sage Publications, 2001)]
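The p-value in the Hosmer-Lemeshow passage quoted above can be checked by hand. The sketch below assumes the change in deviance of 8.034 is referred to a chi-square distribution with 1 degree of freedom (one added interaction term); for 1 degree of freedom the upper-tail probability has a closed form in terms of the complementary error function, so no statistics library is needed.

```python
import math


def chi_square_p_value_1df(statistic):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))


deviance_change = 8.034  # figure quoted from Hosmer and Lemeshow
p_value = chi_square_p_value_1df(deviance_change)

print("p-value:", round(p_value, 4))
print("significant at 0.05:", p_value < 0.05)
```

The result is a little under 0.005, consistent with the quoted figure, and well below the conventional 0.05 cut-off.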
(vii) Pape's Interaction Term to Center-Stage Again
Back then to Pape's use of an interaction term for nationalist rebellion working in tandem with religious differences, where nationalist rebellion is the "focal variable" whose influence on the occurrence of suicide terrorism will vary according to whether religious differences enter into the picture.
Specifically, if you look at Pape's [flawed] data-set used for logit modeling in Appendix II (pp. 265-67 in Dying to Win), there are 23 cases of armed rebellion that he codes among the 58 military occupations carried out by democratic countries between 1980 and 2003's end. Suicide terrorist groups emerge only in 9 of these 23 cases, and one of those 9 in Pape's data-set didn't involve religious differences: Turkish Kurds fighting Turkey's government and attacking its civilian population. [Recall here that Pape's data-set is very, very flawed: it ignores 10 other suicide terrorist groups that all involved Islamist terrorists carrying out suicide attacks against Muslim governments and populations in the same time-period.]
The problem with Pape's model --- as we'll see in detail --- is that interpreting the reported coefficient (parameter value) of the interaction term is far more complex in logistic and other forms of non-linear regression than it is in linear regression.
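A minimal sketch of why the coefficient alone can mislead, assuming a logit with two dummy predictors and their product, and using invented coefficients (none of them Pape's). On the probability scale, the interaction effect of two dummies is a double difference of predicted probabilities, and that double difference need not share the sign of the product term's coefficient.

```python
import math


def logistic(z):
    """Cumulative logistic function: maps log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical logit coefficients, chosen only to illustrate the point.
B0, B1, B2, B12 = 0.0, 2.0, 2.0, 0.5


def interaction_effect(b0=B0, b1=B1, b2=B2, b12=B12):
    """Double difference in predicted probability for two dummies.

    [P(x1=1, x2=1) - P(x1=1, x2=0)] - [P(x1=0, x2=1) - P(x1=0, x2=0)]
    """
    both = logistic(b0 + b1 + b2 + b12)
    only_x1 = logistic(b0 + b1)
    only_x2 = logistic(b0 + b2)
    neither = logistic(b0)
    return (both - only_x1) - (only_x2 - neither)


effect = interaction_effect()
print("coefficient on the product term:", B12)
print("interaction effect on probability:", round(effect, 4))
```

Here the product term's coefficient is positive while the interaction effect on the probability scale is negative, because the main effects have already pushed the predicted probability close to its ceiling.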
Well, There's a 99.99% Likelihood That Pape's Reported 2nd and 4th Logit Models Are a Mess
If, in turn, the interaction term's reported coefficient is misinterpreted, its "effect" or influence on the behavior of Pape's dependent variable --- the frequency with which suicide terrorism occurs or not --- will be unreliable, and any test for its statistical significance will be unreliable too: in technical language, its standard error (the estimated standard deviation of the coefficient's sampling distribution) will be unreliable, and so will a simple t-test on the coefficient of the interaction term. The same is true for any Wald test for the coefficient's statistical significance. A mess then results for the overall logit model's estimated coefficients generally.
How so? Well, in contrast to the interaction effect in linear regression, the interaction variable's effect on the dependent variable is conditional on the other independent variables. In fact, for reasons that we'll examine later, the reported interaction variable's coefficient may even have the wrong sign. And, to get down to nitty-gritty, there's a 99.99% likelihood that Pape misinterpreted his interaction term with these results for his reported logit models.
How can prof bug be so cocksure?
Because, quite simply, a well-known specialist in all forms of non-linear regression examined all the articles published in the most highly regarded economics journals, 13 in all, between 1980 and 2000, that used either logistic regression or other modeling forms of non-linear regression (probit, tobit, and neg.bin) --- and guess what: all 72 articles misinterpreted the coefficient on the interaction term.
Yes, all 72 economic articles --- 100% of the total --- misinterpreted the coefficient value of the interaction term. What chance would you give Pape that he avoided a similar fate?
No need to delve into this here. It will be taken up again and clarified --- along with the proper statistical test for the interaction variable --- in Part Six.
THE PROBLEMS WITH PAPE'S SAMPLE SIZE
How Big Should Pape's Sample Set Have Been?
Well, there's no full agreement among logistic regression specialists on this question, though everyone would agree that the larger the sample set, the better.
What's more, there are three different methods for measuring the necessary sample size:
1. How big must it be for maximum likelihood estimation --- the way in which logistic regression ordinarily calculates the "effects" of variables --- to work properly? You will see figures as low as 50 or 100. What can be said with more assurance here are two things: 1) Either Pape's sample of 58 cases should most certainly have been checked with a "score test" for model performance in addition to the model chi-square test (-2LL) that reflects maximum likelihood estimation. 2) Or --- better yet, given the progress made with logistic regression for small samples in the last several years --- he could have opted to use what are called "exact methods" as a means of estimation, followed by some other model performance test such as the Hosmer-Lemeshow.
2. How many cases for each estimating or independent variable must there be? The general agreement here is 10 cases per estimator at a minimum. In that case, Pape's sample set would pass muster.
3. How many cells are needed when the estimators are categorical --- that is, qualitative dummy variables as in Pape's case --- and reported in a contingency or classification table of the sort that Pape uses on p. 99?
Stay with this last method a few moments.
Observe that the table on p. 99 has been reorganized by Pape to become a 2x2 table. If you've forgotten, go back and look at its report as set out in the introductory comments today. The table reflects the classified success by Pape of his 2nd logit model's observed outcomes of the dependent variable's behavior --- the frequency with which suicide terrorism was observed occurring or not compared to the original baseline model with the constant or intercept term only. It's that baseline or null model that in logistic jargon "predicts" what the frequency of observed outcomes will be. It's the 2nd logit model with three new estimators or "predictors" added to the baseline model --- nationalist rebellion, religious difference, and the interaction term for both --- that does the "observing". A success rate is then calculated. Pape likes to call this "predictive" success. Even if all the other problems with his logit modeling and data-set were flaw-free, that has nothing to do with prediction in the dictionary or scientific sense of the term regarding specific future terrorist attacks.
More to the point, we can conjecture that Pape's 2nd logit model with three estimating variables would have originally needed to be reported in a 3x3 contingency table, and his 4th logit model --- the one with the salvation variable --- in a 4x4 table. Stay with the latter 4th model. The observed cases of suicide terrorism's occurring or not would be reported in a table that looks something like this:
Hypothetical Classification Table of Pape's 4th Logit Model
Using A Standard Cut-Off Point of 0.5
|Observed ||Predicted |
| ||Nationalist Rebellion ||Religious Difference ||Nat Rebel * Relig Differ ||Occupier Concessions ||Percent Correct |
|Nationalist Rebellion || || || || || |
|Religious Difference || || || || || |
|Nat Reb * Relig Differ || || || || || |
|Occupier Concessions || || || || || |
|Overall Percentage || || || || || |
Why Pape's Reported Models Use Too Small A Sample Size
Look at the diagram again. Note that there are 16 cells that would be needed for estimating the observed number of suicide-terrorism's occurrences compared to its non-occurrence. How large, then, should Pape's sample size have been?
For Hosmer and Lemeshow, Pape's 4th table would have required 10 cases or events per cell or a total of 160 cases . . . at any rate, where the "distribution of discrete [read: qualitative] covariates is weighted heavily to one value." His 2nd logit model with 3 estimators would have needed 10 x 9 or 90 cases in all --- roughly 50% more than his data-set of 58 cases.
That quoted proviso definitely fits the outcomes of Pape's 2nd and 4th models. In particular, of the 58 total cases in his data-set, only 9 terrorist cases exist. If you are still unsure, look again at the reported number of times suicide terrorism occurred in Pape's table as set out in our introductory comments. There are 7 observed occurrences of suicide terrorism in the upper-left cell, and once each in two other cells --- a total of 9.
What follows? Clearly, the outcomes heavily favor the non-occurrence of suicide terrorist groups emerging among the 58 cases of democratic military occupation. In turn, then, even someone with no more information about the topic than that suicide terrorist groups are rare in the world could come close to matching Pape's observed and reported results by simply estimating that 100% of his 58 cases of military occupation would show NO suicide terrorist group ever materializing.
Suppose, to go on here, you've guessed that there would be a 100% non-occurrence of suicide terrorism in advance. To look at Pape's reported results for his 2nd logit model, you wouldn't have been off much. Suicide terrorist groups never showed up in 49 of the 58 cases in Pape's data-set --- a percentage frequency outcome of 84%. At best, then, if that were the focal cell of interest in Pape's reported "predictive success" on p. 99, his logit modeling would have improved by only 16% over a totally naïve guess or prediction. In summary, the distribution of Pape's "discrete covariates" is definitely "weighted heavily to one value."
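The naive-guess arithmetic can be laid out in a few lines. This is only a sketch of the calculation described above, using the two counts the article gives (58 cases, 9 of them with suicide terrorism); the variable names are illustrative.

```python
TOTAL_CASES = 58      # military occupations in the data-set, per the article
TERRORISM_CASES = 9   # cases where suicide terrorism was observed

no_terrorism_cases = TOTAL_CASES - TERRORISM_CASES

# Accuracy of always guessing "no suicide terrorism".
naive_accuracy = no_terrorism_cases / TOTAL_CASES

# The most any model could add on top of that naive guess.
max_improvement = 1.0 - naive_accuracy

print("cases with no suicide terrorism:", no_terrorism_cases)
print("naive accuracy: {:.1%}".format(naive_accuracy))
print("maximum possible improvement: {:.1%}".format(max_improvement))
```

Any reported success rate for a fitted model on these 58 cases has to be judged against that naive figure, not against zero.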
So if Hosmer and Lemeshow are right, Pape's 4th reported logit model --- which you'll see leads him to boast that it can account for 100% of the reported results in the upper left-hand corner of his 3x2 contingency table --- would reflect thoroughly unreliable maximum likelihood estimation. None of its reported results could be regarded as accurate. As for his reported 2nd model, it would have to have originally used a 3x3 contingency table with 9 cells, and that would have required 90 cases . . . still a good 50% or more higher number, as we just noted, than exists in Pape's data-set of 58 cases.
A Less Demanding Standard
Note that Hosmer and Lemeshow do add, on p. 347, that "research is needed to determine if 10 [per cell] is too stringent a requirement" for cases like Pape's logit models. Even so, his last 4th model would fail to meet a minimum size by a more generous standard.
Enter here the less stringent view of Alfred DeMaris, another logistic regression theorist. In a Sage University Paper published in 1992 --- Logit Modeling: Practical Applications --- he argues for a less stringent minimal number of cases for a classified logit model: there should be an average number of 5 cases per cell. As you can see, that's far less demanding than Hosmer and Lemeshow's requirement that there be 10 cases for each cell when the independent variables are qualitative.
All the same, even on DeMaris's less stringent metric, Pape's reported 2nd logit model on p. 99 would need 45 cases, which would be OK if DeMaris is right. His 4th logit model with 16 cells would need 80 cases, though, so his 58 cases would not result in accurate maximum likelihood estimates. Pape's last logit model, then, would need a larger data-set than Pape's existing one. Otherwise, the normal way in which logit modeling estimates the behavior of the dependent variable and the coefficients of the independent variables --- maximum likelihood estimation, used probably 99% or more of the time in all software packages --- will produce inaccurate and unreliable estimations here.
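The two per-cell rules can be compared side by side. The sketch below just multiplies cells by the minimum cases per cell quoted above (10 for Hosmer and Lemeshow, an average of 5 for DeMaris) and checks the result against the 58 cases in the data-set; the dictionary names are illustrative.

```python
SAMPLE_SIZE = 58  # cases in Pape's data-set, per the article

# Cells in the contingency tables conjectured in this article.
cells = {"2nd model (3x3)": 9, "4th model (4x4)": 16}

# Minimum cases per cell under each rule of thumb.
rules = {"Hosmer-Lemeshow": 10, "DeMaris": 5}

required = {}
for model, n_cells in cells.items():
    for rule, per_cell in rules.items():
        needed = n_cells * per_cell
        required[(model, rule)] = needed
        verdict = "enough" if SAMPLE_SIZE >= needed else "too few"
        print("{:16s} {:16s} needs {:3d} cases: 58 is {}".format(
            model, rule, needed, verdict))
```

Only one of the four combinations, the 2nd model under the more lenient rule, is satisfied by 58 cases.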
But note how a question quickly rears up here:
Is a Small Sample-Size Like Pape's Irretrievable for Good Logit Modeling?
No, not really. But it's 99% likely that Pape didn't resort to the necessary alternative for logit estimation.
In particular, small sample sizes can still work with a final fitted regression model when it turns out that large sample assumptions built into maximum likelihood estimates aren't reliable. They can do so by use of an alternative way of model estimation --- it's called exact methods --- that has been improved on for several years now to the point it can be regarded as a decent substitute in such cases.
No point in analyzing these methods here. The topic is complicated, and in fact was considered impractical before some efficient algorithms were pioneered in the last few years. Those interested will find a clear discussion of LogXact 4 for Windows 2000 and a practical application of it to one of their case-studies in Hosmer and Lemeshow, pp. 330-339. (In the next few pages after 339, moreover, they specify some tests for deciding what sample size would be needed for a particular logit model. We've referred to their views about the number of cases per variable and the number of cases for qualitative variables that produce heavily unbalanced observations of the dependent variable's behavior.)
Our buggy judgment? To repeat, there's a 99.99% probability that Pape didn't use these exact methods, and probably a 90% probability that he didn't use a score test estimator for model performance either, to compare it with the Gm or loglikelihood ratio test's reported result and see if maximum likelihood estimation would work properly.
Pape Selectively Uses His Flawed Data-Set To Avoid Constructing the Most Intuitively Correct Logit Model
Briefly put, whether out of theoretical bias, ideological bent, or plain garbled thinking, Pape has specified first a theory and then a logit model that collide with the most intuitively evident way to frame a theory and then test it by logistic regression. How so?
Look at his first table on p. 15, which we reproduce here. Ignore the totally inexcusable absence of more than 10 Islamic terrorist groups in the data, focusing instead on what should be self-evident even in this error-blasted data-set as it stands:
|SUICIDE TERRORIST CAMPAIGNS 1980-2003 |
|Date ||Terrorists ||Religion ||Target Country ||Attacks |
|1983 ||Hezbollah ||Islam ||US,France ||5 |
|1982-85 ||Hezbollah ||Islam ||Israel ||11 |
|1985-86 ||Hezbollah ||Islam ||Israel ||20 |
|1990-1994 ||LTTE ||Hindu/secular ||Sri Lanka ||15 |
|1995-2000 ||LTTE ||Hindu/secular ||Sri Lanka ||54 |
|1994 ||Hamas ||Islam ||Israel ||2 |
|1994-95 ||Hamas ||Islam ||Israel ||9 |
|1995 ||BKI ||Sikh ||India ||1 |
|1996 ||Hamas ||Islam ||Israel ||4 |
|1997 ||Hamas ||Islam ||Israel ||3 |
|1996 ||PKK ||Islam/secular ||Turkey ||3 |
|1999 ||PKK ||Islam/secular ||Turkey ||11 |
|2001 ||LTTE ||Hindu/secular ||Sri Lanka ||6 |
Ongoing Suicide-Terrorist Campaigns At The End of 2003
| Date || Terrorists || Religion || Target Country || Attacks |
|1996- ||al-Qaeda ||Islam ||US, Allies ||21 |
|2000- ||Chechens ||Islam/secular ||Russia ||19 |
|2000- ||Kashmirs ||Islam ||India ||5 |
|2000- ||several ||Islam/secular ||Israel ||92 |
|2003- ||Iraq Rebels ||Unknown ||US, Allies ||20 |
|Attacks Not Part of Organized Campaigns || ||14 |
|Total Incidents || ||315 |
What you can't help noticing right off is that the term Islam appears with far greater frequency than Hindu or secular or Sikh in column three. And despite Pape's strange way of dividing terrorist campaigns in column two --- strange, but typically a smokescreen for Islamic dominance in religion --- you can count the number of actual suicide terrorist groups in that second column and quickly see that there were 9 such groups in all, with Muslims manning 7 of those. Just think: 7 out of 9 terrorist groups were Islamic, and Pape entirely ignores this self-evident statistic --- which practically bombards your eyes --- and decides that what really matters in theorizing is the number of total suicidal attacks launched by these 9 groups.
1. After this buggy article was written, prof bug had more time to try penetrating the mirror-illusions in Pape's table 1 --- not, he can assure you, an easy task what with the cover-up distortions --- and he found that the actual number of suicide terrorist groups there amounts to 15, a good 13 of which were Islamic or 88% of the total. (These numbers and the percentages, note quickly, include four unnamed Islamic terrorist groups attacking Israel not referred to by Pape anywhere in the table: Islamic Jihad, Fatah, Popular Front for the Liberation of Palestine, and Forces of Palestinian Popular Resistance; and a fifth --- yes, only one overall Islamic group on a very, very conservative count by prof bug --- for "Iraqi" rebels, whose religion Pape claims to be in the dark about, the poor fellow . . . though he is very certain that the death-dealing Kaboomers are all Iraqis.)
2. Somehow, as prof bug also showed later, Pape forgot to count another 21 suicide terrorist groups active in this period --- or 36 in all, 34 of which happened to be Islamic: 94.4%.
For a corrected buggy table that shows how Pape omitted 20 groups involved in suicide terrorism between 1980 and the start of February 2004, all of them, oddly, Islamic in religion, click here .
As a result, the percentage of Islamic terrorist groups active in suicide attacks between 1980 and the start of 2004 rises from the Pape cover-up table of 88% to 94.4% . . . with only two of the total 36 suicide terrorist groups not Islamic. One of them, the Sikh BKI, committed exactly one suicide attack (in India) and the other, the Hindu-cultist Tamil Tiger LTTE Kabooming in Sri Lanka, far more busy to be sure, but never ever attacking anyone outside that tiny island country. Yes, no attacks even in India despite the Indian army's vigorous participation in trying to help the Buddhist-dominated Sri-Lanka democratic government contain or crush the LTTE.
As it happens, Pape makes sure that you can quickly spot what his biased organization of the data intends you to see: on the bottom line, the number of total "incidents" or attacks --- 315 in all. Notice that he doesn't want you to see instantly the total number of suicide terrorist groups in column 2 or the total number of Islamic groups (7), Hindu/secular (1), or Sikh (1). He doesn't want you to wonder, we can infer, whether or not Islam or some variants of it might not be the determining cause of suicidal Kaboomings.
We, however --- much more alert and openminded to reality than Pape --- wonder why the heck the number of attacks is the crucial figure to focus on. In that case, the Tamil Tiger LTTE carried out a total of 75 such attacks (nicely perfumed by Pape in the bottom row as "incidents"). Even then, sticking with this strange focus, we quickly subtract 75 from 315, and we're left with 240 suicidal attacks, of which 239 were generated by Muslim terrorists. Might not some of us, accustomed to thinking in statistical terms when relevant, then wonder whether the LTTE number isn't something exceptional in the data . . . an outlier to be more precise?
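The tallies above can be re-derived from the rows of the table as reproduced in this article. The sketch below is only that re-count; the figure of 14 attacks outside organized campaigns is read from the table's garbled row and is the value that makes the rows add up to the stated total of 315.

```python
# (group, attacks) for each row of the table as reproduced above.
completed_campaigns = [
    ("Hezbollah", 5), ("Hezbollah", 11), ("Hezbollah", 20),
    ("LTTE", 15), ("LTTE", 54),
    ("Hamas", 2), ("Hamas", 9), ("BKI", 1), ("Hamas", 4), ("Hamas", 3),
    ("PKK", 3), ("PKK", 11), ("LTTE", 6),
]
ongoing_campaigns = [
    ("al-Qaeda", 21), ("Chechens", 19), ("Kashmiris", 5),
    ("several", 92), ("Iraq Rebels", 20),
]
NOT_IN_CAMPAIGNS = 14  # read from the garbled row; assumption noted above

rows = completed_campaigns + ongoing_campaigns
total_attacks = sum(n for _, n in rows) + NOT_IN_CAMPAIGNS
ltte_attacks = sum(n for group, n in rows if group == "LTTE")
remaining_attacks = total_attacks - ltte_attacks
ltte_share = ltte_attacks / total_attacks

print("total attacks:", total_attacks)
print("LTTE attacks:", ltte_attacks)
print("attacks left after removing the LTTE:", remaining_attacks)
print("LTTE share of all attacks: {:.1%}".format(ltte_share))
```

One group out of nine supplies a little under a quarter of the attack count, which is what makes it worth asking whether it is an outlier when attacks are the outcome of interest.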
The Tamil Tigers Outlier-Role
To make sure you are beginning to grasp what the LTTE outlier does if the outcome variable is to be the number of suicidal attacks, note first off that Sri Lanka is a tiny island country of 20 million people where the terrorist groups operate like a religious cult anyway --- not that Pape knows this or cares to let you know it --- and have never attacked Indian citizens outside the island itself. This is the case, mind you, even though India's democratic government has often sent in large numbers of troops to help the Buddhist ruling government, itself democratic, fight the Hindu terrorists in the eastern part of the island. In this light, to give such prominence to one group that commits almost a quarter of the total suicidal attacks between 1980 and 2004, while differing in substantive nature from 7 of the remaining 8 groups, seems . . . well, either silly, garbled, or ideologically motivated.
The result is patently dubious and at odds with common sense. Instead of focusing on the self-evident fact that 7 of his 9 suicide terrorist groups active between 1980 and 2003 were Islamic, he sets out theoretically to show that religious ideology in and of itself can't possibly be a major independent influence, then specifies the numbers of suicide attacks as the outcome of interest for subsequent theorizing, and then --- to top it off --- constructs a series of logit models that use the wrong independent variables: not religious ideology per se, but nationalist rebellion and religious difference and the two as an interaction term.
To Make Sure You Understand the Manipulations Going on Here, Consider These Three Points:
First, Pape could have constructed a logit model using Islam (or Islamism) as a main-effect independent variable, and run a variety of diagnostic statistics to test for outliers. (The other non-Islamic terrorist group, Sikh in nature, carried out only one attack.) More importantly, he could have just used common sense if he weren't biased or garbled in his thinking and seen why the Tamil Tigers could be treated as an outlier.
Second, imagine that there's another small island country in the Indian Ocean called Bongo Longo where the ethnic minority of Longo-Longos has grievances against the policies of the dominant Talky-Lions despite the open democratic elections for 50 years won by the Talky-Lions' various parties. Suppose further that the Whammy Cat terrorist group emerges among the Longo-Longos and starts Kabooming into oblivion lots of Talky-Lion citizens. Let's say, to drive our point here home, that the Whammy Cats carried out exactly 20,000 suicide terrorist attacks against the Talky-Lion population between 1980 and 2003, and they make no bones about it: their religion requires Talky-Lion victims.
If we now entered the Whammy Cats into Pape's initial data-set laid out in table 1 on p. 15 of his book, what would immediately strike your eye as a reader?
Well, whereas there were 315 recorded suicide attacks in the column labeled Attacks, we would now have a total of 20,315! And if suicide attacks were to remain the focal interest of Pape's subsequent theorizing and statistical testing, he would have to show logit-modeling results on p. 99 that clearly underscore the overwhelming influence of Longo-Longo religion and Whammy Cat terrorism as the major cause of suicide terrorism in the world between 1980 and 2003. After all, they're now responsible for more than 98% of all suicide terrorist attacks in that interval. What's more, prof bug can assure you: such a logit model of this one-variable influence would achieve a fabulous rate of predictive success very near to 100%!
Would we then expect Pape to spend 250 pages developing the statistically tested Whammy Cat/Longo-Longo Theory of Suicide Terrorism?
You get the idea what an outlier can do in such instances? Transformed into a hypothetically dominant case, it can play havoc with any fitted regression model.
And third, as our coup-de-grace to Pape's back-and-forth sleights of hand, he only ignores the number of suicide terrorist groups as the key output variable in that table 1 on p. 15 . . . followed, of course, by four and a half chapters of theorizing on his pet outcome, the number of suicide terrorist attacks that manages, in the process, to conceal the dominant Islamic role in suicide terrorism since 1980. But get this! Are you ready? Believe it or not, when we reach the last few pages of Chapter 6 and Pape trots out a new data-set --- which he creates, codes, classifies, and certifies on his own --- the outcome variable that actually is regressed on in his statistical work is none other than "whether a suicide terrorist campaign [read: one per group]" emerged or not in the 58 cases of democratic military occupation between 1980 and 2003!
You follow? You get the picture?
Back on p. 15, Pape ignores the self-evident outcome of concern: the number of distinctive suicide terrorist groups active in that 23-year period. That way, he can conceal the dominant causal role in suicide terrorism --- which any open-minded scholar would say is what should be our "main effects" independent variable or estimator. Instead, he goes on and theorizes in line with his biases. In the process, he contrives a theory of nationalist rebellion that's aggravated by religious difference between occupier and occupied as the "main causal pathways", then uses these two latter "causal pathways" as independent variables for logit modeling. These become his "main effects" predictors or estimators in his 2nd and 4th logit models that seek to "predict" whether suicide terrorism is observed to appear or not appear in each of his 58 cases.
Worse yet, Pape not only is able in this sneaky way to ignore the religious ideological thrust of these suicidal terrorist groups --- 9 in all, remember: 7 of them Islamic --- he further bleaches the Islamist nature of suicide terrorism by setting up a bogus data-set that looks at the military occupations carried out only by democratic governments. The result should be clear by now. Pape's data-sets in Chapters 1 and 6 are manipulated and deformed to fit his biases, both theoretical and statistical, whenever it suits him. Specifically, what he rejected as the dependent variable that was self-evident to us in table 1 --- the number of suicide terrorist groups (9 in all) --- is brought back into play for use in statistical modeling in Chapter 6.
What Should An Intuitively Specified Logit Model Look Like If Suicide Terrorist Groups --- Not Attacks --- Had Been the Dependent Variable From p 15 On?
Most likely like this:
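ln[p / (1 - p)] = a + b1X
p = the probability that suicide terrorist groups emerge (that Y = 1)
a = the constant term or intercept
Y = whether suicide terrorist groups emerge (Y = 1) or don't emerge (Y = 0)
X = Islam is present or not (X = 1 or X = 0), with b1 as its coefficient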
And that's it. This would, of course, be Pape's 2nd logit model, with a null or baseline model --- no independent variables present, only the intercept term --- run first and then tested with -2LL and maybe other statistics. Whether this 2nd logit model would become a researcher's final fitted logit model would depend on the results of statistical testing for model performance (or goodness-of-fit), as well as tests of the one independent coefficient, Islam present or not, and, if you want, its success rate for classification or predictive purposes.
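To make the null-versus-one-predictor comparison concrete, here is a minimal sketch in plain Python. The data are HYPOTHETICAL toy values invented for illustration; they are not Pape's 58 cases, and nothing here reproduces his software or his results. The sketch leans on one standard fact: with a single 0/1 predictor and no group made up entirely of 0s or entirely of 1s, the maximum-likelihood fitted probabilities equal the observed proportions in each group, so no iterative fitting is needed.

```python
import math

def logit(p):
    """Log-odds of a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Probability implied by a log-odds value z."""
    return 1.0 / (1.0 + math.exp(-z))

def neg2_log_lik(y, probs):
    """-2 * log-likelihood (-2LL) of 0/1 outcomes y under fitted probabilities."""
    return -2.0 * sum(
        math.log(p) if yi == 1 else math.log(1.0 - p)
        for yi, p in zip(y, probs)
    )

# HYPOTHETICAL toy data (an assumption, not Pape's data-set):
# x = 1 if the predictor is present, y = 1 if a suicide terrorist group emerges.
x = [1] * 6 + [0] * 8
y = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0, 0, 0, 0, 0]

# Null (baseline) model: intercept only, so every case gets the overall rate.
p_null = sum(y) / len(y)
a_null = logit(p_null)
neg2ll_null = neg2_log_lik(y, [inv_logit(a_null)] * len(y))

# One-predictor model: fitted probabilities are the group proportions.
p0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)
p1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / x.count(1)
a = logit(p0)                 # intercept: log-odds when X = 0
b1 = logit(p1) - logit(p0)    # coefficient: change in log-odds when X = 1
fitted = [inv_logit(a + b1 * xi) for xi in x]
neg2ll_model = neg2_log_lik(y, fitted)

# Likelihood-ratio statistic: the drop in -2LL from null to fitted model.
lr_stat = neg2ll_null - neg2ll_model

print(f"a = {a:.3f}, b1 = {b1:.3f}")
print(f"-2LL null = {neg2ll_null:.3f}, -2LL model = {neg2ll_model:.3f}")
print(f"likelihood-ratio statistic = {lr_stat:.3f}")
```

The likelihood-ratio statistic is conventionally compared against a chi-square distribution with one degree of freedom, since the fitted model adds one coefficient to the null model.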
One thing is pretty clear even now though --- without this specified model ever having been applied to Pape's data. It would almost certainly produce a better "sensitivity" rate of predictive success than Pape's reported 2nd logit model on p. 99 --- that is, the share of cases where suicide terrorism is actually observed to occur that the model correctly predicts. And with even greater certainty, we can say that the PPV --- positive predictive value --- would be far, far higher than Pape's mediocre 8 out of 29 predicted cases (or 8 out of 23 such cases) . . . our hesitation here due to one fact alone: we don't have access to how Pape's logistic software estimated and reported the predictive rates for overall success, sensitivity, specificity, PPV, NPV, false positives, and false negatives.
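For readers unsure how these rates relate to one another, here is a minimal sketch of the standard definitions. The only figures taken from the text are the two possible PPV readings, 8 out of 29 and 8 out of 23; the full classification table further down is a HYPOTHETICAL example, and the function names are illustrative only.

```python
def ppv(true_pos, false_pos):
    """Positive predictive value: correct positive predictions / all positive predictions."""
    return true_pos / (true_pos + false_pos)

def classification_rates(tp, fp, fn, tn):
    """Standard rates from a 2x2 classification table (all four cells assumed non-empty)."""
    return {
        "overall": (tp + tn) / (tp + fp + fn + tn),  # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),               # observed positives correctly predicted
        "specificity": tn / (tn + fp),               # observed negatives correctly predicted
        "ppv": tp / (tp + fp),                       # predicted positives that are correct
        "npv": tn / (tn + fn),                       # predicted negatives that are correct
    }

# The two PPV readings mentioned in the text: 8 correct out of 29 or out of 23 predicted.
ppv_of_29 = ppv(true_pos=8, false_pos=29 - 8)
ppv_of_23 = ppv(true_pos=8, false_pos=23 - 8)
print(f"PPV, 8 of 29: {ppv_of_29:.3f}")
print(f"PPV, 8 of 23: {ppv_of_23:.3f}")

# HYPOTHETICAL table, invented purely to show the remaining rates.
rates = classification_rates(tp=6, fp=2, fn=3, tn=9)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity and PPV share a numerator but not a denominator: the first divides by the observed positives, the second by the predicted positives.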
Finally, To Make Sure You Understand How Flawed Pape's Table One Is . . .
We will reproduce the corrected and accurate buggy prof table once more for comparison.
All of the attacks, you'll notice, are by Islamist terrorist groups.
SUICIDE TERRORIST GROUPS OMITTED BY PAPE, ALL ISLAMIC
|Date ||Terrorists ||Religion ||Target Country ||#Attacks ||#Killed & Wounded |
|1981 ||Egyptian Islamic Jihad ||Islamist ||Egypt ||1 ||1 k (Pres Sadat); 12 w |
|1983 || El-Dawa ||Islamist ||Kuwait ||1 ||? |
|1985 || El-Dawa ||Islamist ||Kuwait ||1 ||? |
|1995 || Egyptian Islamic Jihad ||Islamist ||Pakistan ||1 ||16k; 60 w |
|1992 ||Hezbollah ||Islamist ||Argentina ||1 ||29 k; 242 w |
|1994 ||Hezbollah ||Islamist ||Argentina ||1 ||85 k; 300 w |
|1994 ||Ansar Allah ||Islamist ||Panama ||1 ||21 k; |
|1995 ||GIA: Armed Islamic Group (Algeria) ||Islamist ||Algeria ||1 ||42 k; 265 w |
|1994 ||GIA ||Islamist ||France ||1 ||aborted plane bombing of Eiffel Tower* |
|1995 ||Al-Gama'a al-Islamiyya ||Islamist ||Croatia ||1 ||? |
|1997 || Al-Gama'a al-Islamiyya ; Jihad Talaat al-Fath. ||Islamist ||Egypt ||1 ||62 k; 19 w |
|2001 ||Jaish-e-Muhammed ||Islamist ||India ||1 ||? |
|2002 ||Al Qaeda ||Islamist ||Tunisia ||1 ||19 k |
| 2002 || Jemaah Islamiyah ||Islamist ||Indonesia (Bali) ||1 ||202k; |
|2003 ||Jemaah Islamiyah ||Islamist ||Indonesia ||1 ||12k; 150 w |
|2002 ||Al-Qaeda linked Somalis ||Islamist ||Kenya ||1 ||13k; 80 w |
|2003 || MILF: Moro Islamic Liberation Front ||Islamist ||Philippines || ||21 k; 150 w |
|2003 ||GIMC Moroccan Combatant Group ||Islamist ||Morocco ||1 - 5** ||45k |
|2003 ||Three or More Sunni & Shiite Suicide Groups ||Islamist ||Pakistan ||3 ||? |
*This abortive suicide bombing by a hijacked airliner --- GIA out of Algeria the terrorist group in charge of the mission --- is singled out because it was intended, as the captured terrorists admitted, to crash into the Eiffel Tower full of tourists. The mission failed when the French pilot claimed he had to stop in Marseilles to refuel, and French commandos stormed the plane.
**It's not clear how to list the 5 separate but simultaneous bombings in Casablanca. It depends on how Pape would define them: as 1 attack or 5.
THE TECHNICAL PROBLEMS WITH PAPE'S LOGIT MODELING EXAMINED IN DEPTH
Well, considering that by this point your patience is probably being overly tested --- and, when you get down to it, so is prof bug's intellectual stamina --- it seems a good idea to end today's buggy article right here. You can expect Part Six to be published on the buggy site in a new article by the end of this week.