
Saturday, October 22, 2005

ROBERT PAPE TESTS HIS THEORY OF SUICIDE TERRORISM STATISTICALLY: 4th of a 7-article Series

This is the 4th article in an ongoing series --- now several weeks old --- on an influential book on suicide terrorism that Robert Pape, a political scientist at the University of Chicago, has written. What with the length of the argument that unfolds here, note that it will be divided into two separate articles. Today's argument ends at the start of Part Three, which will be published on this site in a day or two.



Pape's Book and Argument

Published earlier this year, Dying to Win: The Strategic Logic of Suicide Terrorism examines the total number of suicide terror attacks that occurred between 1980 and the end of 2003 --- or so Pape argues. It divides these into 18 different suicide terror campaigns carried out by 11 different organized terrorist groups --- 9 of which are Islamic, just as 14 of the 18 terror campaigns are Islamic in nature. Pape contends nonetheless that religion per se, let alone Islamist groups, has no direct motivating force in suicide terrorism . . . the main motivation behind all 18 of these terror campaigns being clearly nationalist instead. In particular, suicide terrorism is a rational if desperate last resort of the weak and oppressed to force an alien occupying military power off their territory.

On Pape's view, then, the urge to nationalist liberation and self-determination is the driving force in the suicide terror practiced by these 11 different terrorist groups. Religion, to the extent it plays any role here, does so indirectly and with secondary impact. Tersely put, if the religion of the occupied people happens to differ from that of the occupying power, then the stakes in nationalist rebellion against the occupier are raised, the citizenry of the occupying country is demonized, and suicide terrorism becomes much more likely.

How Today's Buggy Article Unfolds

Today's article divides into four parts, each briefly outlined here to help you keep track of the various twists-and-turns in the overall argument.

(i.) There's no need to say anything more about Pape's nationalist theory of suicide terrorism in this introductory section, and for a solid reason: as you'll soon see, Part One will set out the theory in clear outline form, followed by a schematic diagram of its causal links that is taken from p. 96 of Dying to Win.

One hint intrudes here. When you examine Pape's theory, recall what the two previous buggy articles on Pape's book showed about his work: tersely put, his data-set for the number of suicide terrorist attacks and campaigns is noticeably amiss. Whether intentionally or not, Pape's omissions and the related effort to sneak in several Islamist suicide attacks aimed at mainly moderate or pro-Western Muslim countries have the effect of whitewashing the clear connection between the overwhelming number of suicide terrorist groups and radical Islamist ideology.

All told, there weren't just 9 Islamist terrorist groups active in suicide attacks between 1980 and 2003 out of a total of 11 such groups, but rather 20 out of a total of 22 such groups. What's more --- something that contradicts a key component of Pape's theory --- most of them weren't motivated by nationalism, but rather by an effort to disrupt Muslim governments and incite a revolutionary overthrow as a prelude to installing a system of purified Islamist rule . . . most likely resembling (for reasons stated later) the totalitarian-like regimes of Taliban Afghanistan or Saudi Arabia or Sudan's mass-murdering rulers or the fanatical Sharia-intoxicated imams and sheiks in Northern Nigeria.

Here, in concrete detail, is the buggy table showing the suicide terrorist groups and attacks that Pape ignored or glossed over or possibly concealed.

Suicide Terrorist Campaigns Missed By Pape, 1980-2003

Date | Terrorists | Religion | Target Country | # Attacks | # Killed (k) & Wounded (w)
1981 | Egyptian Islamic Jihad | Islamist | Egypt | 1 | 1 k (Pres. Sadat); 12 w
1995 | Egyptian Islamic Jihad | Islamist | Pakistan | 1 | 16 k; 60 w
1992 | Hezbollah | Islamist | Argentina | 1 | 29 k; 242 w
1994 | Hezbollah | Islamist | Argentina | 1 | 85 k; 300 w
1994 | Anser Allah | Islamist | Panama | 1 | 21 k
1995 | GIA: Armed Islamic Group (Algeria) | Islamist | Algeria | 1 | 42 k; 265 w
1994 | GIA | Islamist | France | 1 | aborted plane bombing of Eiffel Tower*
1997 | Al-Gama'a al-Islamiyya & Jihad Talaat al-Fath | Islamist | Egypt | 1 | 62 k; 19 w
2002 | Al Qaeda | Islamist | Tunisia | 1 | 19 k
2002 | Jemaah Islamiyah | Islamist | Indonesia (Bali) | 1 | 202 k
2003 | Jemaah Islamiyah | Islamist | Indonesia | 1 | 12 k; 150 w
2002 | Al-Qaeda linked Somalis | Islamist | Kenya | 1 | 13 k; 80 w
2003 | MILF: Moro Islamic Liberation Front | Islamist | Philippines | | 21 k; 150 w
2003 | GIMC Moroccan Combatant Group | Islamist | Morocco | 1 - 5** | 45 k


*This abortive suicide bombing by a hijacked airliner --- with GIA out of Algeria the terrorist group in charge of the mission --- is singled out because it was intended, as the captured terrorists admitted, to crash into the Eiffel Tower full of tourists. The mission failed when the French pilot claimed he had to stop in Marseilles to refuel, and French commandos stormed the plane.

**It's not clear how to list the 5 separate but simultaneous bombings in Casablanca. It depends on how Pape would define them: as 1 attack or 5.


As you'll see later on today, this initial data-set used by Pape isn't the only dubious set in his book, far from it. So is the one he generates for statistical purposes in chapter 6. So too is the data-set he uses in chapter 7 --- that chapter devoted entirely to showing that al Qaeda's hundreds of suicide attacks world-wide are in line with his nationalist theory of suicide terrorism --- where he applies a convoluted statistical test to show that al Qaeda's known suicide terrorists don't hail from Muslim societies where radical Islamist fundamentalisms flourish.

(ii.) Enter our main concern today in Part Two: Pape's statistical effort to test the causal links in his nationalist theory of suicide terrorism, using a logit regression model for that purpose.

The model's estimated results and Pape's interpretation of them are, as you'll see, beset by numerous problems --- at any rate, to the extent that he gives us much information about the logit model, which isn't much at all. All we find out about the nuts-and-bolts of the model's specification and the various statistical tests he ran on the estimated results is some scant report (tucked away in fn. 43 on p. 294), plus the classification scheme he uses for reporting those results on p. 99. Even so, we can be pretty confident about these problems.

That said, note that Part Two will be mainly devoted to a non-technical effort to clarify what logistic regression or logit analysis or modeling amounts to. (The terms are interchangeable.) Those of you familiar with logit need only skim through that section at a galloping rate. Those of you who aren't familiar with the technique --- or with regression statistical techniques of any sort --- will benefit, prof bug hopes, from reading those comments carefully. By the time you're through Part Two, you won't, needless to add, be a whiz-like specialist in logit modeling; but you should at least have enough of a decent working idea of the topic that you'll be able to follow the more focused buggy criticisms of Pape's statistical work in Part Three.

(iii.) Part Three brings us to the core of the buggy argument, devoted entirely to appraising Pape's statistical work . . . which, as you'll see, is beset by numerous problems, some technical, others more substantive.

We'll look first at the data-set that Pape himself gathered, coded, and classified. All sorts of troubles abound here.

Then we'll look at the three different logit models that Pape runs --- only the first, by the way, is reported even in scant detail by him --- and see how the use of an interaction term for one of the independent variables (don't worry for the moment what this means) is almost certainly misinterpreted by him. Because of this misinterpretation, any reported statistical tests of significance are doubtful. (Don't worry for the time being about the technical terms in this and the next three paragraphs. They'll be clarified later.)

There's more trouble.

We'll also see that the estimated effects of these independent variables in his models on the behavior (values) of his dependent variable --- namely, whether suicide terrorism occurs or not --- are very likely entangled in what's called a joint loglikelihood. That's a common problem in logistic regression when a data-set sample is not probabilistically selected --- by random means or near-random selection or by an estimated natural process. Pape's population and hence his sample of data that he runs his logit model's conditional variables on don't reflect probability sampling at all. They're both what's called state-dependent sampling, and more specifically, a retrospective case-study sample. In the upshot, if Pape's logit model were run backwards --- with the independent and dependent variables' roles reversed --- the same odds ratio (a key measure of a logit model's estimated coefficients) would result as when the models were run in the "causal" way Pape intended . . . at any rate they would be unless Pape took corrective action.
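The symmetry just described can be checked with simple arithmetic. What follows is only a minimal sketch, using made-up counts that are not Pape's data: with a single yes/no independent variable, the odds ratio a logit model estimates is the cross-product ratio of the 2x2 table of counts, and that ratio is unchanged when the roles of X and Y are swapped. The function names `odds_ratio` and `transpose` are illustrative inventions.

```python
def odds_ratio(table):
    """Cross-product (odds) ratio of a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


def transpose(table):
    """Swap the roles of the row variable and the column variable."""
    (a, b), (c, d) = table
    return [[a, c], [b, d]]


# Hypothetical counts for illustration only; NOT Pape's data.
# Rows: X present / X absent.  Columns: Y occurs / Y does not occur.
counts = [[8, 4],
          [2, 16]]

forward = odds_ratio(counts)              # Y treated as the outcome
backward = odds_ratio(transpose(counts))  # X treated as the outcome

print(forward, backward)
```

Both directions give the same number, which is why an odds ratio taken by itself cannot say which variable is doing the "causing".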

Finally, unless Pape used SAS software and not SPSS --- prof bug isn't sure about the other main statistical software widely used by researchers (Stata) --- there will also certainly be a bias in the classification results no matter how they're interpreted. Much worse, Pape, as we'll see, misuses the reported results in his 2x2 classification scheme. He claims that these results of his logit modeling show that his nationalist theory of suicide terrorism "correctly predicted" 49 of 58 cases. The theory does no such thing. What the logit model has done --- ignoring all the other problems we've just touched on --- is correctly classify 49 of 58 cases. Such classification results in logistic regression can't be used for wider theoretical purposes; they are good only for one thing: for confirming correct classification, nothing else.

If Pape's theory could do what he claims his 2x2 classified logit results indicate, he would have achieved a degree of predictive accuracy far greater than any economic forecasting model has ever achieved --- and there are several different forecasting models used in the USA alone, all run on large data-bases of quantitative data unlike Pape's small sample which is derived in non-probability ways . . . a point, as you'll see, that's fully documented and hammered home later on here.

(iv.) Part Four will then summarize whether the following claims that Pape himself makes about his statistical work on pp. 96-97 --- right before he tells us how he coded and assembled his data-set and reports the logit model's estimated (conditional) results --- are justified or not. To wit:


"To test my theory, I employ a methodology that combines the features of focused comparison and statistical-correlative analysis using the universe --- [read: total number] --- of foreign occupations, 1980-2003. Correlative analysis of this universe enhances confidence that my theory can predict future events by showing the patterns predicted by the theory occur over a large class of cases [58 in all]. Detailed analysis of historical cases enhances confidence that the correlations found in the larger universe are not spurious: the theory accurately identifies the causal dynamics that determine outcomes [of suicidal terrorism] " p. 96[italics and bold-type added].

Remember, the buggy argument about Pape's statistical work is divided between this article and the ones that follow. The next (5th) article is essentially done, waiting only to be formatted in HTML for the Web. The 6th buggy article will continue our in-depth analysis of Pape's nationalist theory of suicide terrorism, probing the four constituent parts --- alleged causal variables --- in that theory and showing how and why they're all unreliable. A 7th will examine all the numerous problems --- some real whoppers --- that bedevil Pape's efforts to apply his flawed nationalist theory of suicide terrorism to al Qaeda and related jihadist terror groups inspired by radical jihadist Islamisms.


 

PART ONE:
PAPE'S NATIONALIST THEORY OF SUICIDE TERRORISM SUMMARIZED


It's in Chapter Six of Pape's book (pp. 79-101) that a full-fledged theory of suicide-terrorism emerges for the first time.

To Pape's credit, his "nationalist theory of suicide-terrorism" is set out clearly there and discussed in useful detail. It consists of four component parts --- a set of independent or explanatory variables if you prefer, all intended to clarify the circumstances that will likely "cause" suicide attacks to be launched. We'll examine these four variables here, along with the useful schematic diagram of the alleged "causal" pathways generating suicide terrorist attacks that Pape himself presents on p. 96.

The Four Independent ("Causal") Variables Sketched In

"Sketched in" is the key here. The four independent (estimating) variables are set out in a fast, top-skimming manner, at any rate for now. The aim is to give you a rough working idea of each variable, followed by the schematic diagram; nothing more . . . not for today anyway. It's only in the next buggy article, remember, that each of these alleged causal links and pathways will be delved into at length.

As for the use of quotes around "causal", they're there purposefully. Pape's theory isn't a scientific theory in a strong sense, as we'll see, but rather more like a wiring diagram with arrows drawn in different pathways to signify hoped-for "causal pathways" among the variables. And very few regression models, linear or non-linear, can causally test any theory, except for a few in the natural sciences.

As we'll see, even the most formalized economic regression-models running on strictly quantitative data that can be randomly sampled have never been tested accurately for any causal links --- even, believe it or not, neo-classical demand theory, the core of microeconomic theory. For a good 75 years now, econometricians have been trying to test the various empirical variables in demand theory --- "the symmetry of (compensated) Slutsky matrix, the homogeneity of degree zero of individual and aggregate demand functions, and Walras's Law (adding up or Engel aggregation), and guess what? Almost all tested statistical models have produced results that contradict the theory of demand." (See the discussion of this on pages 96-98 in D. Wade Hands, Reflection without Rules: Economic Methodology and Contemporary Science Theory [Cambridge, 2001]. The quoted terms are from pp. 96-97.)

All these points will be clarified later on in Part Three. For the moment, fasten your attention on the independent variables that comprise Pape's nationalist theory of suicide terrorism.

1) There's an alien military occupation on a territory of another people --- a national or ethnic community that resents it --- by a democratic country. Pape says that he can find no cases where suicide terrorism has been used against non-democratic occupiers.

Note in passing that this particular variable doesn't enter into Pape's logit model per se. It's not even an exogenous variable, operating from the outside on the logistic regression's estimates of the outcome variable. Instead, by definition, Pape limits his data-collection and coding to only democratic occupying countries . . . wrongly so, as we now know and as will be clarified in part three.

2) Sooner or later, a noticeable religious conflict between the occupying power and the occupied population has to develop that aggravates the locals' fears and resentments of the occupier and leads to demonizing its society and civilians. In turn, the demonizing will celebrate national martyrdom by armed rebels against the evil occupier, justifying the use, if need be, of suicide terrorism . . . the latter seen by more and more of the occupied people as a last desperate resort at coercing the alien democratic country to withdraw its military forces.

Note: though prof bug strives hard to refrain from critical comments on Pape's theory until the next buggy article, he can't fully adhere to this self-denying ordinance here.

More specifically, if Pape is right, the fact that one religion --- Islam, which accounts for most of the suicide terror campaigns between 1980 and 2003 even in his flawed and understated data-set that appears on p. 15 of his book --- has a 1300-year tradition of armed martyrdom suffered in jihad against infidels, with the suicide attackers instantly entering Paradise the second after they're killed, has nothing to do per se with such demonizing or justification of suicide attacks. Nor does it have anything to do with Islamist terror groups carrying out 90% of the suicide attacks between 1980 and 2003, at any rate when Pape's whitewashed data-set is corrected. For that matter, neither does the tremendous number of other murderous actions carried out by fervent, militantly enraged sections of Muslim populations against Christians, Buddhists, animists, Hindus, or even Muslims designated by fundamentalist extremists as apostates, whether in Pacific Asia (Indonesia, Thailand, and Singapore), or in South and Central Asia, or in the Persian Gulf region, or in the Middle East and North Africa, or in the Sudan or Northern Nigeria.

You want clear evidence? Well, consider this: a table compiled daily by ReligionofPeace.com shows that --- since 9/11's massacres in New York and Washington D.C. --- there have been more than 2800 Muslim terrorist attacks around the world as of August 28, 2005. To repeat: in the four years since 9/11, Muslim-inspired terrorism has resulted in 2800 different attacks against overwhelmingly civilian targets.

Does anyone besides Pape and apologists for Islamist extremism --- only a minority of Islam's 1.2 billion people --- really believe this?


3) Before suicide terrorism is resorted to as a final desperate effort at national liberation, the growing nationalist ardor and rippling hatred of the occupier has to spawn an armed rebellion --- whether non-suicide terrorist attacks or urban or non-urban guerrilla warfare --- carried out by national resistance-movements against the foreign country's armed forces and civilians. There haven't been any suicide terrorist groups that have emerged between 1980 and 2003 without a prior national rebellion already in progress, and they start suicide attacks only after these other forms of armed rebellion have failed . . . or so it seems (Pape isn't clear on this point).

4) And finally --- as his statistical testing of the theory reveals at the very end of chapter 6 --- there's a lack of noticeable concessions by the occupying power to the occupied people's desires for national self-determination, in which case suicide-terrorist groups will very likely materialize. If, though, concessions are offered that the locals judge as ensuring local autonomy or holding out a prospect of it in the future --- even though the local autonomy will fall short at times of full-fledged national independence and sovereignty --- suicide terrorism will either not occur or fizzle out.



Oops: Almost Forgot That a 5th Variable Sneaks In the Pape Theory

Specifically, at the very end of chapter 6, Pape adds a 5th independent or estimating variable to his theory --- or more accurately, to his logit model. Small wonder. It is entered there in order to deal with a big regression problem: his classified results show that what is called in logistic regression the "sensitivity of prediction" --- which is a strictly technical statistical term that has nothing to do with prediction of future events or behavior of any sort --- is a mediocre 50%. That is, when the reported logit results are placed in a 2x2 classification scheme --- which we'll set out in Part Two --- the model's ability to classify correctly the cases where suicide terrorism occurs, given the influences of the original four independent variables on it, is no better than flipping a coin. Not that he tells us this, mind you.

What Pape does then --- which is acceptable enough in any regression modeling --- is seek to improve the estimated outcome here by adding a new independent variable to the logit model: called "concessions", it's another coarse category (to use a statistical term) that Pape estimates from . . . well, it's not clear. He cites several other studies as sources, but nothing more. In fact, the new variable is introduced at the very bottom of p. 99, clarified in lickety-split manner on p. 100, and that's that.

What does emerge when Pape codes the new 5th variable is that when democratic occupying powers offer concessions of some sort to actual or would-be suicide terrorist groups, those groups are then highly likely to cease their suicide attacks. Despite this, we are left with no clear idea at all of what level of concessions generally is needed to deal with suicide terrorist groups and appease them. The whole thing seems very ad hoc, an effort to salvage a regression model. And true enough --- again, leaving aside all the other problems that beset Pape's logit model that will be pinned down in Part Two --- in the revised logit model using the 5th variable, the "sensitivity of prediction" regarding when suicide terrorism occurs is upped to 100%, and the resulting logit estimates are able overall to account for 56 of the 58 individual case-study data-points (observations) in his data-set.
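To make the "sensitivity of prediction" idea concrete, here is a minimal sketch of how sensitivity, specificity, and overall accuracy are computed from a 2x2 classification table. The cell counts are hypothetical and are not Pape's (the book's cell counts are not reproduced in this article); the function name `classification_summary` is likewise an illustrative invention. The only figures taken from the text are the overall hit rates of 49 of 58 and 56 of 58.

```python
def classification_summary(tp, fn, fp, tn):
    """Summarize a 2x2 classification table.

    tp: events classified as events      fn: events classified as non-events
    fp: non-events classified as events  tn: non-events classified as non-events
    """
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # share of actual events caught
        "specificity": tn / (tn + fp),   # share of actual non-events caught
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
    }


# Hypothetical cell counts for illustration only; NOT Pape's table.
summary = classification_summary(tp=6, fn=6, fp=2, tn=36)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")

# The only figures taken from the text are the overall hit rates.
print(f"49 of 58 = {49 / 58:.3f}")
print(f"56 of 58 = {56 / 58:.3f}")
```

With these invented counts the overall accuracy looks respectable while the sensitivity is a coin flip. That gap is why a headline "cases correctly classified" figure has to be read alongside the sensitivity.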

A Schematic Diagram of the Pape Theory

On p. 96 of his book, Pape provides us with a diagram of his full-fledged theory's causal pathway. He allows for an alternative pathway running in the opposite direction as a control. What follows is taken directly from that page.

Pape's Model of Suicide-terrorism

1) Solid arrows represent the theory proposed in this book.
2) The dashed arrow --- running from rebellion to nationalism --- represents a causal path that sometimes influences the production of national identity; but that plays little role in determining when suicide-terrorism campaigns occur.
3) The dotted arrow represents a causal path that al-Qaeda and perhaps other terrorist organizations hoped will occur, but that has not done so.


 



PART TWO:
BEFORE WE GET TO PAPE'S STATISTICAL WORK TO TEST HIS NATIONALIST THEORY OF SUICIDE TERRORISM, HERE ARE SOME BUGGY COMMENTS ON REGRESSION MODELING


Enter Logit (Logistic) Regression

To test his theory, Pape specifies a logit (non-linear) regression model and runs it on a data-set that he himself assembled.

The aim of any regression model, linear or non-linear, is to estimate statistically those factors --- called independent or explanatory or estimating variables (which can be manipulated) --- that influence the behavior of the dependent or outcome variable using probability analysis.

In Pape's logit model, the independent variables --- or estimators or (a strictly technical term used in logistic regression) "predictors" --- are established by his nationalist theory of suicide terrorism, and the dependent or outcome variable is the probability of suicide terrorism occurring or not, given the influence of the independent or explanatory variables. As the diagram at the end of Part One showed clearly, those independent variables are nationalist rebellion against a democratic occupying power, religious conflicts between the occupied nationals and the foreign occupier, and --- as he tells us on p. 99 --- these two "causal" variables working in tandem. And, to repeat, the dependent or outcome variable is whether suicide terrorism occurs or not, given these three explanatory or estimating variables when the entire regression model and its variables are run on a data-base that Pape has himself assembled.

 

Why Pape Can't Use Classic Linear Regression

(i.) Qualitative Outcome Variables

Look over what prof bug just said about Pape's dependent or outcome variable. It's strictly qualitative and, more specifically, a binary (or dichotomous) one: suicide terrorism either occurs when Pape applies his logit model to his data-set or it doesn't.

Linear regression can't handle a qualitative dependent variable effectively. All sorts of poor estimated results will occur if it's used --- something true even of a variety of linear regression that was developed to use on such a dependent variable that's known as loglinear regression. Logistic regression and logit analysis (or modeling) are fairly new, not more than two or three decades old. And though their estimates are much more complex to interpret and test statistically, the software for logistic regression has become increasingly easy to use, and is now by far the most used form of regression in the social sciences other than in economics.
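One of those "poor estimated results" is easy to see in a toy case. The sketch below fits an ordinary least-squares line to a made-up 0/1 outcome and shows that some fitted values fall below 0 or above 1, which makes no sense as probabilities. It then passes a linear expression through the logistic function, which always lands strictly between 0 and 1. The data, the helper names `fit_line` and `logistic`, and the coefficients -4.5 and 1.0 are all assumptions chosen for illustration; the logistic coefficients are not estimated from the data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def logistic(z):
    """Map any real number z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Made-up data: x is some score, y is a 0/1 (did not occur / occurred) outcome.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

a, b = fit_line(xs, ys)
linear_fits = [a + b * x for x in xs]

# Illustrative coefficients only (assumed, not fitted by maximum likelihood).
logistic_fits = [logistic(-4.5 + 1.0 * x) for x in xs]

print("linear fits range:  ", min(linear_fits), max(linear_fits))
print("logistic fits range:", min(logistic_fits), max(logistic_fits))
```

The straight line overshoots at both ends of the X range, whereas the S-shaped logistic curve flattens out as it approaches 0 and 1.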

(In case you're interested in understanding why economics stands out from the rest of the social sciences in not using logistic regression very much, it's because that discipline is the only social science where the data-samples --- and the populations from which the samples are drawn --- are overwhelmingly quantitative: GDP data, inflation data, employment data, prices of goods and services, productivity growth rates, and so on. Hence most of the time the dependent variable is strictly quantitative. Even if, as it happens, the estimated values of Y, the dependent variable, assume a non-linear form, most of the time the independent variables can be transformed mathematically and generate a linear distribution.

(Observe in passing, though, that the use of more and more panel data in econometrics --- usually small samples drawn from a much larger population of specific individuals, household units, firms, or various levels of government --- requires non-linear regression techniques too . . . mainly because the dependent variable is qualitative too: for instance, market research to estimate whether consumers will prefer to buy domestic or foreign vehicles, or sedans or SUV's, or prefer to drive to work as opposed to taking public transportation. And statistical medical studies --- such as estimating the effects of statin (cholesterol-lowering) drugs on the incidence of heart attacks among a sampled population --- have used logistic regression for two decades or more. In the latter case, the dependent or outcome variable would divide the sampled population into a dummy variable: the incidence of heart attack up or down: 0 or 1).


(ii.) Back To Pape's Need for a Non-linear Regression Technique.

In particular, what with a qualitative binary dependent variable, Pape specifies a logit model --- more precisely, three of them with scant information about the last two (yes, even scanter than for the first one he reports about) --- to test his nationalist theory's claimed "causal links". He then runs the model's variables on a sample selection (data-set) that he himself has to create. As you'll see in Part Three, his sample selection is equivalent to his assembled population of foreign occupation, national rebellion against it, religious differences present or not, and suicide terrorism occurring or not. As with all regression modeling, linear or not, his aim is to estimate the behavior of Y, his dependent variable, as the X estimating or independent variables on its behavior take on different values in response to the individual data-points or observations.

Note in passing that a binary qualitative variable like Pape's is treated as a dummy variable: when suicide terrorism is estimated to occur, it takes the value of "1", and when it is estimated not to occur its value is "0".

 

The Differences Between Linear and Logistic Regression Illustrated Graphically

The following graphic diagram clearly shows the differences between the estimated values of Y's behavior as it responds to the influences of the X (independent) variables in a linear regression as the various X variables themselves take on different values, on the one hand, and in a non-linear logistic regression on the other. (It's taken from this online article, which is an outstanding introduction of a very concise and readable sort to logistic regression and logit modeling. It does presuppose a basic understanding of linear regression and statistical fundamentals.)



(i-a.) What Linear Regression Amounts To

As you can see, because of the way linear regression estimates the results of Y's behavior in response to the estimated values of the X independent variables, the linear regression function turns out to be a straight line. The technical name for this is SRF --- the sample regression function. (The sample, remember, is what can be drawn either randomly or in non-probabilistic ways from a larger population of data.) More precisely, in linear regression it's also called the least-squares line or function.

If you were to produce a scattergram of the individual observations (data-points) that a linear regression model analyzes --- which would show up as individual dots scattered above and below the straight line in the diagram here --- the least-squares line (function) would show these various data-points clustering closely around the line if, the key point here, the linear model were efficiently specified and tested. The result would look like this. (The diagrams are taken from an excellent introduction to econometrics by William S. Brown, Introducing Econometrics (West, 1991). For some reason, this unusually lucid survey with its praiseworthy concrete examples is out of print.)

Y = a + bX + e


where

Y is the quantitative dependent variable, a is the coefficient on the constant or intercept term, b is the coefficient(s) on the independent variable(s), X is the independent variable(s), and e is the error term.
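For readers who like to see the pieces of the equation computed, here is a minimal sketch of how a least-squares fit produces a, b, and the residuals e for a single X variable. The five data-points are invented for illustration, and `least_squares` is a hypothetical helper name, not something taken from Brown's book or from Pape.

```python
def least_squares(xs, ys):
    """Return (a, b) minimizing the sum of squared residuals of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x   # value of the fitted line where X = 0
    return a, b


# Invented observations for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares(xs, ys)

# e: what is left of each Y after the fitted line a + b*X is taken out.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print("intercept a:", a)
print("slope b:", b)
print("residuals e:", residuals)
print("sum of residuals:", sum(residuals))
```

By construction, a least-squares fit with an intercept leaves residuals that sum to zero apart from rounding, so the dots in a scattergram sit as much above the line as below it.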

(i-b.) An Intercept or Constant Term

Note that there are two terms on the right side of the equation, "a" and "e", that are always present in a linear regression model, however pared down the number of its estimating or independent variables. The first one is known as the constant or intercept.

The coefficient(s) on the X estimating variable(s) give the slope of the linear regression function: how much the value of Y, the dependent variable, varies as X increases by one unit. The intercept or constant estimates where the linear regression line --- more precisely, remember, the SRF or sample regression function --- crosses the Y axis: which is the same as saying what the value of Y is when X is 0.

Depending on your sample selection, the value of Y when X = 0 could, of course, be negative.

(i-c) As for the Error Term or Residual . . .

It too is always on the right side of a linear regression equation.

Even the simplest linear regression has an error term or residual. It expresses all the factors or influences on the behavior of Y that the model can't account for. Among these factors are random chance as well as unforeseen "disturbing conditions" such as the impact, say, of a war on GDP growth of the US. The only time there wouldn't be an error term is if someone specified a strict identity regression model: say, how does the behavior of X in inches affect the behavior or values of Y in centimeters. Needless to add, why anyone who's not anal-compulsive would waste time on such a definitional regression model isn't clear.
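The inches-to-centimeters case can be run in a few lines to show what a regression with no error term looks like. This is only a sketch: the list of lengths is arbitrary, and `least_squares` is a hypothetical helper name. The one hard fact used is the definitional conversion of 2.54 centimeters to the inch.

```python
def least_squares(xs, ys):
    """Return (a, b) minimizing the sum of squared residuals of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b


# Arbitrary lengths in inches; centimeters follow by definition (1 in = 2.54 cm).
inches = [1, 2, 5, 10, 12]
centimeters = [2.54 * x for x in inches]

a, b = least_squares(inches, centimeters)
residuals = [y - (a + b * x) for x, y in zip(inches, centimeters)]

print("intercept a:", a)        # zero, up to floating-point rounding
print("slope b:", b)            # the conversion factor, up to rounding
print("largest residual:", max(abs(e) for e in residuals))
```

Every data-point sits exactly on the line, so nothing is left over for an error term to absorb. Real social-science data never behave this way.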

Sometimes, too, a large error term's value reflects the presence of an important factor or influence on "Y", the outcome variable, that is always present, but can't yet be fathomed by existing social science theories. Over time, that could change as theoretical work improves.

For instance, the original neo-classical economic-growth model developed by the Nobel prize winner, Robert Solow of MIT, in the mid-1950s, turned out when tested statistically to always have a large residual or error term --- up to 50% of annual GDP growth; and for decades, growth theorists debated its interpretive meaning. By the mid-1970s or so, a lot of theorists started claiming that the large error term reflected technological progress, and some of them began trying to find ways to "endogenize" such technological progress by introducing it as an explicit estimating (independent) variable along with Solow's original two other estimating variables: the annual growth of capital stock and the annual growth of the labor force . . . the latter sometimes qualified in a further controversial effort to take into account not just the number of workers added to the economy each year, but the qualitative changes in their education and training, both of which would make them more productive.

Finally, in the mid-1980s, Paul Romer of Stanford developed an efficient and highly regarded new growth theory and regression model that found ways, he claimed, to proxy the qualitative improvements in technology. Needless to add, that led to a new theoretical debate among those who claimed that the original Solow model --- augmented by adding qualitative improvements to the labor force variable (such as years of education) --- sufficed, reducing the size of the original error variable to acceptable proportions without "endogenizing" technological progress itself. Others, especially younger growth theorists, opted for the Romer model. Generally, most of the debate seems to have favored the "New" or "Endogenous" growth theorists, but 20 years of econometric testing have not led to conclusive results . . . something that marks the nature of theoretical debates in the social sciences.


(i-d) The Error Term, understand, Is What Makes Any Regression Model Probabilistic.

Meaning? Simply this: if there weren't an error term, the estimated results of a regression model would be deterministic --- the realm of strict causal laws: once "a" the intercept term and all the "bX" estimating variables are calculated, all of Y's behavior would invariably be accounted for and invariably the same. Statistical work of any sort, not just regression modeling, isn't like that. It is always probabilistic, not deterministic; and at most, any statistical prediction holds that the estimated outcome will occur with a certain level of probability. The error term reflects the failure of the estimated coefficients or parameter-values of "a" and the "bX" variables to account for the dependent or outcome variable's behavior (value) with 100% accuracy.

In logistic regression, for reasons that you'll see in a few moments, the error term or residual is latent, but it can be made explicit and used to help interpret a logit model's estimates if the researcher using the model wants to.

 

A Bugged-Out Example of a Simple Linear Regression Model, Eventually Made More Complex

In classic linear regression, the least-squares method is used to estimate the overall parameter-values (coefficients) of each independent variable in a linear model as they respond to the sample selection's observations or data-points . . . along with certain statistical tests of their statistical reliability and of the model's overall estimating accuracy and efficiency. In the process, statistical inferences can or can't be legitimately made from the estimated results of the regression model's sample selection back to the larger population. (Note that the least-squares method involves differential calculus. It estimates overall values for "a" and "b" that minimize the sum of the squared errors . . . with those errors reflected, you'll recall, in a scattergram by the various data-points or observations falling either above or below the estimated sample linear function.

The closer those observations cluster around the least-squares estimated straight line, the more efficient the linear regression model that's been specified. That said, don't worry. There's no point in entering into the math in this article, which involves taking the partial derivatives with respect to "a" and to "b", setting them to zero, and then solving the resulting equations simultaneously. In any case, unlike when prof bug first learned regression techniques back in the Dark Ages, the software now does this for you automatically even on a pc.)
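For readers who'd like to see what the software is doing, here is a minimal sketch in Python of the least-squares calculation for a model with one X variable. The data points are invented, and the function names (least_squares, sum_squared_errors) are just illustrative labels, not anything taken from a statistics package.

```python
def least_squares(xs, ys):
    """Return (a, b) minimizing sum((y - a - b*x)**2) for a one-variable model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Setting both partial derivatives of the squared-error sum to zero
    # and solving the two equations gives these closed-form expressions.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between the data and the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data points, purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares(xs, ys)
best = sum_squared_errors(xs, ys, a, b)

# Nudging either estimate away from its least-squares value raises the error sum.
nudged = sum_squared_errors(xs, ys, a + 0.1, b - 0.05)
print(a, b, best, nudged)
```

The comparison at the end is the practical meaning of "least squares": no other straight line through these points produces a smaller sum of squared errors.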


 

(1-a) A Bugged-Out Example

Consider a hypothetical example --- say, a linear regression model that tries to pin down the relationship between the varying size of police departments in various urban areas and the incidence of violent crime in those areas.

A pared-down model would have just one independent estimating variable --- say, the varying size of police departments in urban areas --- and seek to determine whether the larger the size, the lower the incidence of violent crime in those areas . . . the latter, of course, the dependent or output variable. Assume there are 1700 medium-size and large urban areas in the US. A random sample drawn from this number (the total population, 1700) might be 100 urban areas, and the linear regression model would estimate the influence of the X variable (varying size of 100 urban police forces) on the incidence of crime in those cities. Once the estimates are arrived at, a variety of regression techniques would seek to determine if statistical inferences could then be made back to the entire 1700 urban areas in the US.

In this case, the simple linear regression model would look like the equation set out earlier, with only one estimating or independent variable, bX . . . which stands for the varying size of police forces in the sample selection of 100 urban areas. The Y dependent variable --- the varying incidence of violent crime --- then behaves in certain ways (takes on certain values) in response to changes in each of X's values. The terms here are taken from this source.

Note in passing that in logistic regression, the error term is latent and can --- at the discretion of the researcher --- be made explicit. The reason for its latency? As you'll see momentarily in (iii.), the logit transformation of a non-linear equation incorporates all the probabilities in its estimating procedure --- maximum likelihood estimation. (A related estimating procedure turns the likelihood function into a "logged" likelihood function, often done to simplify the multiplication of probabilities.)

(1-b) The Need For More Estimating (Independent) Variables

In the hypothetical example of a pared-down linear regression model with one estimating variable --- the varying size of police forces in urban areas --- the resulting estimates would very likely turn out to be pretty inadequate. The incidence of violent crime, it goes without saying, doesn't depend just on the sizes of police departments . . . though note: a more complex model with other estimating variables would allow you to isolate the influence of police-size by controlling --- holding constant --- those other variables influencing its incidence, something done these days automatically by regression software. Translated into a scatter diagram, the various observations in our simple, pared-down linear model --- 100 in all --- wouldn't likely cluster closely to the estimated SRF, and the error term (e) would turn out to be large. Goodness-of-fit measures like R-square would be mediocre, and the coefficient of "sizes of police department" probably wouldn't pass a simple t-test to determine if it is statistically significant at the 5% level. (Which means, in statistical jargon, that the estimated effects would be impossible to distinguish from random chance.)
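Here is a hedged sketch in Python of how R-square and the t-test would be computed for such a one-variable model. Every number is invented: the sample of 100 "urban areas", the assumed true relationship, and the large random disturbance are all assumptions chosen only to mimic a noisy, pared-down model. The 1.98 cut-off is the approximate two-tailed 5% critical value of t with 98 degrees of freedom.

```python
import math
import random

random.seed(0)  # fixed seed so the made-up sample is reproducible

# Hypothetical sample of 100 urban areas: police per 1,000 residents (x)
# and violent crimes per 10,000 residents (y). All numbers are invented.
n = 100
xs = [random.uniform(1.0, 4.0) for _ in range(n)]
ys = [50.0 - 2.0 * x + random.gauss(0.0, 15.0) for x in xs]

# Least-squares estimates of the intercept (a) and slope (b).
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
b = sxy / sxx
a = mean_y - b * mean_x

# R-square: the share of Y's total variation that the fitted line accounts for.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
sst = sum((y - mean_y) ** 2 for y in ys)                   # total
r_squared = 1.0 - sse / sst

# t-statistic: the slope divided by its estimated standard error.
std_error_b = math.sqrt(sse / (n - 2) / sxx)
t_stat = b / std_error_b
significant_at_5_percent = abs(t_stat) > 1.98

print(b, r_squared, t_stat, significant_at_5_percent)
```

With a disturbance this large relative to the slope, expect a low R-square; whether the slope clears the 5% bar depends on the particular sample drawn.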

A much more efficient regression model, then, would have to include other influencing variables on the incidence of violent crime: say, if B1X1 were "the size of police forces", then B2X2 might be the "average age of the urban area's population", B3X3 might be the "per capita income of the area", and B4X4 might be the "percentage of African-Americans in the area" . . . all considered major influences on the incidence of crime in criminological studies. The resulting model would look like this:

Y = a + b1X1 + b2X2 + b3X3 + b4X4 + e . . . where the lower-case b is used as the estimator for the value of the coefficient of each X independent variable when run on the selected sample drawn from the population of 1700 urban areas in the US.

(1-c) What The New Estimated Results Would Likely Be in the "Multiple Regression Model"

Remember, this is strictly a hypothetical example, and prof bug hasn't specified the model and applied it to crime data-bases. Almost certainly, though --- thanks to these three additional estimating variables --- the resulting scattergram would show the 100 observation points clustering much closer to the estimated linear regression line of the sample function (the SRF), and the resulting error term "e" would be much smaller. The newly specified model --- called multiple regression analysis --- would account for a much larger share of the variation in the observations . . . the various changes in the incidence of violent crime as the individual values of each of the estimating independent variables change along the X axis.

In estimating each coefficient or parameter of the four independent variables, observe that the software automatically controls for the others: it holds the other three variables constant as it estimates the changes in the value (behavior) of Y (the incidence of violent crime) as, say, b1X1 --- the size of police forces --- changes. Each coefficient can be tested statistically for its significance with what's called a "t-test", though it's better also to run what are called "goodness-of-fit" tests for the entire model: R-square (the coefficient of determination, with values between 0 and 1) and anova --- the analysis of variance. In linear regression --- unlike in logistic regression --- there is general agreement on what these tests should be.
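What "holding the other variables constant" buys you can be sketched in Python. To keep it short, the example uses just two of the estimating variables (police size and per capita income) rather than all four, and every number is invented, including the assumption that richer areas hire more police. The helper names (solve_linear_system, fit_multiple) are illustrative only; real regression software uses more refined numerical methods.

```python
import random


def solve_linear_system(matrix, vector):
    """Solve matrix @ x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_multiple(rows, ys):
    """Least-squares estimates [a, b1, b2, ...] from the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


random.seed(1)  # fixed seed so the invented sample is reproducible
rows, ys = [], []
for _ in range(100):
    income = random.uniform(20.0, 60.0)  # per capita income, invented units
    # Assumption for illustration: richer areas tend to hire more police.
    police = 1.0 + income / 30.0 + random.gauss(0.0, 0.8)
    crime = 60.0 - 2.0 * police - 0.5 * income + random.gauss(0.0, 1.0)
    rows.append((police, income))
    ys.append(crime)

# Multiple regression: the police coefficient is estimated with income held constant.
a, b_police, b_income = fit_multiple(rows, ys)

# Pared-down model: with income left out, the police coefficient absorbs its effect.
a_only, b_police_only = fit_multiple([(p,) for p, _ in rows], ys)

print(b_police, b_income, b_police_only)
```

In this invented data the true police effect is -2.0. The two-variable fit should land near it, while the pared-down fit overstates it because police size is standing in for income as well.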

 

All of which brings us to . . .

Logistic Regression and Logit Modeling or Analysis

Look again at the initial diagram in this discussion of regression modeling: it compares the estimated straight-line (linear) sample regression function (SRF) with the S-curve that the SRF takes in a logistic regression for a model like Robert Pape's. And remember, the reason why Pape has to specify a non-linear logistic regression model is that his dependent variable is strictly qualitative and binary --- whether suicide-terrorism occurs or not.

(1-a) How a Logit Model Is Estimated


This current section, alas, is slightly technical. No help for it: linear regression in its classic form is fairly simple to grasp --- not that all linear regression, trust me, is that simple; just the opposite. Still, despite the technical nature of these comments, they're brief, so try to run your eye over them even if they don't make much sense to you.

Least-squares estimates for generating the results of a linear regression can't, it goes without saying, apply to Pape's or any logit model. Instead, the logit model is estimated using "maximum likelihood procedures" to generate the values (coefficients) of the variables.
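A rough idea of what a maximum likelihood procedure does can be sketched in Python. The sketch assumes the logistic form p = 1/[1 + exp(-a - bX)] that this article sets out a little further on, uses a tiny made-up data set of eight 0/1 outcomes, and searches for the "a" and "b" that make those outcomes most probable. The search method here is Newton-Raphson, which is one common choice; the particular routines inside SAS, SPSS, or Stata may differ.

```python
import math


def sigmoid(z):
    """Logistic function: turns any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(xs, ys, a, b):
    """Log of the probability of seeing exactly these 0/1 outcomes, given a and b."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(a + b * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def fit_logit(xs, ys, steps=25):
    """Newton-Raphson search for the (a, b) that maximize the log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        # g_* is the gradient of the log-likelihood; h_* is its negative Hessian.
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g_a += y - p
            g_b += (y - p) * x
            h_aa += w
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 system (negative Hessian) @ step = gradient.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


# Made-up data: X is some score, Y records whether the event occurred (1) or not (0).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_logit(xs, ys)
print(a, b, log_likelihood(xs, ys, a, b), log_likelihood(xs, ys, 0.0, 0.0))
```

Note that the 0s and 1s in this toy data overlap along X. If they were perfectly separated, no finite maximum would exist and a bare-bones search like this one would fail.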

(1-b) Expressed mathematically, the Logit Model Uses . . .

a natural log transformation that explicitly incorporates all the estimated probabilities . . . more precisely, the log odds ratio, or logit. (The following equations and interpretations are taken from the excellent online article by John Whitehead, mentioned earlier, that sets out --- far more clearly than any article prof bug has seen --- a very useful example of logistic regression basics. The math is kept to a bare minimum, and the article is a gem of concise and nicely illustrated writing. It does presuppose that you know the basics of linear regression.)

The "logit" model, Whitehead notes, solves these problems:

"ln[p/(1-p)] = a + BX + e or

[p/(1-p)] = exp(a + BX + e)

where:

ln is the natural logarithm, log_exp, where exp = 2.71828…
p is the probability that the event Y occurs, p(Y=1)
p/(1-p) is the "odds ratio"
ln[p/(1-p)] is the log odds ratio, or "logit"
all other components of the model are the same.

"The logistic regression model is simply a non-linear transformation of the linear regression. The "logistic" distribution is an S-shaped distribution function which is similar to the standard-normal distribution (which results in a probit regression model) but easier to work with in most applications (the probabilities are easier to calculate). The logit distribution constrains the estimated probabilities to lie between 0 and 1. "

"For instance, the estimated probability is:

p = 1/[1 + exp(-a - BX)]

With this functional form:

if you let a + BX = 0, then p = .50
as a + BX gets really big, p approaches 1
as a + BX gets really small, p approaches 0."
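The behavior of that functional form is easy to check in Python. With p = 1/[1 + exp(-a - BX)], p is exactly .50 when a + BX is 0, climbs toward 1 as a + BX becomes large and positive, and falls toward 0 as a + BX becomes large and negative. The values of "a" and "b" below are made up purely for illustration, and the function names are prof-bug-style labels, not anything from Whitehead's article.

```python
import math


def probability(a, b, x):
    """Estimated probability p = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


def log_odds(p):
    """The logit: natural log of p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Made-up coefficients, chosen so that a + b*x equals 0 at x = 6.
a, b = -3.0, 0.5

for x in (-20.0, 6.0, 30.0):
    p = probability(a, b, x)
    print(x, a + b * x, p)

# Taking the log odds of p recovers the linear part a + b*x,
# which is why the logit is linear in X even though p is not.
recovered = log_odds(probability(a, b, 10.0))
print(recovered, a + b * 10.0)
```

However extreme X becomes, p never leaves the interval between 0 and 1, which is precisely what a straight-line model can't guarantee for a binary outcome.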


 

What Can We Conclude About Logit Analysis, Which, Remember, Pape Uses?

Basically this: logistic regression is far different from linear regression, and it is more complex and harder to interpret efficiently. In routine social-science use, it is also much newer --- a matter of the last few decades --- than linear regression and its variants.

What follows should be predictable. There are lots of differences and controversies that envelop the proper interpretive procedures and statistical tests of any resulting estimates in logit analysis. True enough, the software running logistic regression --- usually SAS, SPSS, or Stata --- has made the routine specification of a model and its testing much easier over the years. On the other hand, a lot of low-grade cookbook work, little else, is done by researchers using logistic regression software. Hordes of them don't care about the theoretical base of what they're doing, let alone the various controversies surrounding the use and interpretation of it. Worse yet, the sample selections --- their data-sets --- are frequently questionable as well.

In the upshot, all sorts of dubious or outright silly claims for their uninspired, churned-out logit models are made.


 

So where are we? Well, as Bugs Bunny used to say at the end of his cartoons (Bugs is no relative, by the way, of Prof Bug --- which is a shame), we're here . . .

"That's All For Now, Folks!" on What Linear Regression and Logistic Regression Amount To.

For those of you whose interest in regression modeling has been whetted by these comments, the following little book is a gem of concise, lucid analysis using simple examples that are followed throughout the book as linear regression is explained and illustrated --- such as what accounts for baseball players' salaries, or what explains why incumbent Presidents or their immediate successors (say, Al Gore in 2000) are elected or not (expressed quantitatively, by the way, as the percentage of the total two-party vote in elections during the 20th century), or what accounts for the huge variation across US states in abortion rates. To the extent any statistical book is fun to read, this comes as close as any prof bug knows, and it doesn't presuppose any statistical knowledge beyond knowing what an average is, plus some very simple algebra and geometry. Entitled Regression Basics, it's by Leo H. Kahane and published as a fairly recent paperback by Sage.

As for basic statistics, start with this online site:

The outstanding introductory survey by John Whitehead of logistic regression is found here. Be sure to load up the powerpoint illustrations that are linked to at the bottom.

 

PART THREE:
PAPE'S LOGIT MODEL SET OUT AND ASSESSED


Part Three, note, is continued in the next buggy article.