
Studying Programs is Hard to Do: Why It’s Difficult to Write a Compelling Evaluation

Evaluation sections in proposals are both easy and hard to write, depending on your perspective, because of their estranged relationship with the real world. The problem boils down to this: it is fiendishly difficult and expensive to run evaluations that will genuinely demonstrate a program’s efficacy. Yet RFPs act as though the 5–20% of the budget that most grants reserve for evaluation should be sufficient to run a genuine evaluation process. Novice grant writers who understand statistics and the difficulty of teasing apart correlation and causation, but who also realize they need to tell a compelling story to have a chance at being funded, are often stumped by this conundrum.

We’ve discussed the issue before. In Reading Difficult RFPs and Links for 3-23-08, we said:

* In a Giving Carnival post, we discussed why people give and firmly answered, “I don’t know.” Now the New York Times expends thousands of words in an entire issue devoted to giving and basically answers “we don’t know either.” An article on measuring outcomes is also worth reading, although the writer appeared not to have read our post on the inherent problems in evaluations.

That last link is to an entire post on one aspect of the problem. Now, The Chronicle of Higher Education reports (see a free link here) that the Department of Education has cancelled a study to track whether Upward Bound works.* A quote:

But the evaluation, which required grantees to recruit twice as many students to their program as normal and assign half of them to a control group, was unpopular from the start […] Critics, led by the Council for Opportunity in Education, a lobbying group for the federal TRIO programs for disadvantaged students, said it was unethical, even immoral, of the department to require programs to actively recruit students into programs and then deny them services.

“They are treating kids as widgets,” Arnold L. Mitchem, the council’s president, told The Chronicle last summer. “These are low-income, working-class children that have value, they’re not just numbers.”

He likened the study to the infamous Tuskegee syphilis experiments, in which the government withheld treatment from 399 black men in the late stages of syphilis so that scientists could study the ravages of the disease.

But Larry Oxendine, the former director of the TRIO programs who started the study, says he was simply trying to get the program focused on students it was created to serve. He conceived of the evaluation after a longitudinal study by Mathematica Policy Research Inc., a nonpartisan social-policy-research firm, found that most students who participated in Upward Bound were no more likely to attend college than students who did not. The only students who seemed to truly benefit from the program were those who had low expectations of attending college before they enrolled.

Notice, by the way, Mitchem’s ludicrous comparison of evaluating a program to the Tuskegee experiment: one would divide a group into those who receive afterschool services that may or may not be effective and a control group that wouldn’t be able to receive services at equivalent funding levels anyway. The other cruelly denied basic medical care on the basis of race. The two examples are so different in magnitude and scope as to make him appear disingenuous.

Still, the point is that our friends at the Department of Education don’t have the guts or suction to make sure a program they’ve spent billions of dollars on actually works. Yet RFPs constantly ask for information on how programs will be evaluated to ensure their effectiveness. The gold standard for doing this is to do exactly what the Department of Education wants: take a large group, randomly split it in two, give one services and one nothing, track both, and see if there’s a significant divergence between them. But doing so is incredibly expensive and difficult. These two factors lead to a distinction between what Isaac calls the “proposal world” and the “real world.”
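To make concrete what that gold-standard design actually requires, here’s a minimal Python sketch, with every name and number invented for illustration: applicants are randomly split into served and control halves, a single outcome is tracked for each group, and a two-proportion z-test checks whether the difference is statistically significant. A real evaluation would have to track real students for years, which is exactly where the cost and difficulty come from.

```python
# A minimal sketch of the "gold standard" randomized evaluation described above.
# All applicants, group sizes, and outcome counts are hypothetical.
import math
import random

def assign_groups(applicants, seed=0):
    """Randomly split applicants into treatment and control halves."""
    rng = random.Random(seed)
    shuffled = applicants[:]
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical recruitment: 400 applicants, half served, half in the control group.
applicants = [f"applicant_{i}" for i in range(400)]
treatment, control = assign_groups(applicants)

# Hypothetical tracked outcome: number of graduates in each group.
treatment_grads, control_grads = 152, 138

z, p = two_proportion_z_test(treatment_grads, len(treatment),
                             control_grads, len(control))
print(f"z = {z:.2f}, p = {p:.3f}")  # p > 0.05 with these made-up numbers: no detectable effect
```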

In the proposal world, the grant writer states that data will be carefully tracked and maintained, participants followed long after the project ends, and continuous improvements made to ensure midcourse corrections in programs when necessary. You don’t necessarily need to say you’re going to have a control group, but you should be able to state the difference between process and outcome objectives, as Isaac writes about here. You should also say that you’re going to compare the group that receives services with the general population. If you’re going to provide the ever-popular afterschool program, you should say, for example, that you’ll compare the graduation rate of those who receive services with the rate of those who don’t as one of your outcome measures. This is a deceptive measure, however, because those who are cognizant enough to sign up for services probably also have other things going their way, which is sometimes known as the “opt-in problem”: those who are likely to present for services are likely to be those who need them the least. This, however, is the sort of problem you shouldn’t mention in your evaluation section, because doing so will make you look bad, and the reviewers of applications aren’t likely to understand the issue anyway.
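For readers who want to see the opt-in problem in action, here’s a toy simulation in which every parameter is invented: a program with zero effect still looks good when its self-selected participants are compared with the general population, because the students who sign up were already more likely to graduate.

```python
# A toy simulation of the "opt-in problem": self-selected participants
# outperform the general population even though the program does nothing.
# All parameters are invented for illustration.
import random

rng = random.Random(1)

def simulate_student():
    """Return (motivation, graduated) for one hypothetical student."""
    motivation = rng.random()                           # unobserved advantage, 0..1
    graduated = rng.random() < 0.5 + 0.4 * motivation   # more motivation, more graduation
    return motivation, graduated

students = [simulate_student() for _ in range(10_000)]

# Students "cognizant enough to sign up" skew toward higher motivation;
# the program itself changes nothing about their outcomes.
opted_in = [grad for mot, grad in students if mot > 0.6]
everyone = [grad for _, grad in students]

print(f"graduation rate, opt-in group:       {sum(opted_in) / len(opted_in):.1%}")
print(f"graduation rate, general population: {sum(everyone) / len(everyone):.1%}")
# The opt-in group graduates at a visibly higher rate despite receiving
# no effective services -- the comparison flatters the program.
```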

In the real world of grants implementation, evaluations, if they are done at all, usually bear little resemblance to the evaluation section of the proposal, leading to vague outcome analysis. Since agencies want to get funded again, it is rare that an evaluation study of a grant-funded human services program will say, more or less, “the money was wasted.” Rather, most real-world evaluations will say something like, “the program was a success, but we could sure use more money to maintain or expand it.” Hence the reluctance of someone like Mr. Mitchem to see a rigorous evaluation of Upward Bound—better to keep funding the program on the assumption that it probably doesn’t hurt kids and might actually help a few.

The funny thing about this evaluation hoopla is that even as one section of the government realizes the futility of its efforts to provide a real evaluation, another ramps up. The National Endowment for the Arts (NEA) is offering at least $250,000 for its Improving the Assessment of Student Learning in the Arts (warning: .pdf link) program. As subscribers learn, the program offers “[g]rants to collect and analyze information on current practices and trends in the assessment of K-12 student learning in the arts and to identify models that might be most effective in various learning environments.” Good luck: you’re going to run into the inherent problems of evaluations and the inherent problems of people like Mr. Mitchem. Between them, I doubt any effective evaluations will actually occur—which is the same thing that (doesn’t) happen in most grant programs.


* Upward Bound is one of several so-called “TRIO Programs” that seek to help low-income, minority, and/or first-generation students complete post-secondary education. It’s been around for about 30 years, and (shameless plug here) yes, Seliger + Associates has written a number of funded TRIO grants with stunningly complex evaluation sections.
