
The unsolvable standardized data problem and the needs assessment monster

Needs assessments tend to come in two flavors. The first basically instructs the applicant to “Describe the target area and its needs,” and the applicant chooses whatever data it can come up with. For most applicants that’ll be some combination of Census data, the local Consolidated Plan, data gathered by the applicant in the course of providing services, news stories and articles, and whatever else they can scavenge. Some areas have well-known local data sources; Los Angeles County, for example, is divided into eight Service Planning Areas (SPAs), and the County and United Way provide most data relevant to grant writers by SPA.

The upside to this system is that applicants can use whatever data makes the service area look worse (looking worse is better because it indicates greater need). The downside is that funders will get a heterogeneous mix of data that frequently can’t be compared from proposal to proposal. And since no one has the time or energy to audit or check the data, applicants can easily fudge the numbers.

High school dropout rates are a great example of the vagaries in data work: definitions of what constitutes a high school dropout vary from district to district, and many districts have strong financial incentives to avoid calling any particular student a “dropout.” The GED situation in the U.S. makes dropout statistics even harder to understand and compare; if a student drops out at age 16 and gets a GED at 18, is he a dropout or a high school graduate? The mobility of many high-school-age students makes it harder still, as do the advent of charter schools, online instruction, and the decline of the neighborhood school in favor of open enrollment policies. There is no universal way to measure this seemingly simple number.*

The alternative to the “do whatever” system is for the funder to say: You must use System X in manner Y. The funder gives the applicant a specific source and says, “Use this source to calculate the relevant information.” For example, the last round of YouthBuild funding required the precise Census topic and table name for employment statistics. Every applicant had to use “S2301 EMPLOYMENT STATUS” and “S1701 POVERTY STATUS IN THE PAST 12 MONTHS,” per page 38 of the SGA.

The SGA writers forgot, however, that not every piece of Census data is available (or accurate) for every jurisdiction. Since I’ve done too much data work for too many places, I’ve become very familiar with the “(X)” in American Factfinder2 tables—which indicates that the requested data is not available.
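For anyone pulling these tables into a spreadsheet or script, here is a minimal sketch of how to keep the “(X)” marker from being silently treated as a zero. The function names and the sample field labels are hypothetical, and the sample values are made up for illustration; only the “(X)” marker itself comes from the tables described above.

```python
from typing import Dict, List, Optional

# "(X)" is the marker described above; treating a blank cell the same way
# is an assumption made for this sketch.
UNAVAILABLE_MARKER = "(X)"


def parse_cell(raw: str) -> Optional[float]:
    """Return the numeric value of a table cell, or None if it is unavailable."""
    text = raw.strip()
    if text == UNAVAILABLE_MARKER or not text:
        return None
    # Strip thousands separators and a trailing percent sign before converting.
    return float(text.replace(",", "").rstrip("%"))


def unavailable_fields(cells: Dict[str, str]) -> List[str]:
    """List the fields whose values are missing for a jurisdiction."""
    return [name for name, raw in cells.items() if parse_cell(raw) is None]


# Illustrative, made-up cells for one hypothetical jurisdiction.
sample_cells = {
    "unemployment_rate": "12.5%",
    "poverty_rate": "(X)",
}
gaps = unavailable_fields(sample_cells)
```

Knowing which fields come back empty before the writing starts at least tells you where the proposal will need an explanation and alternative data.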

In the case of YouthBuild, the SGA also specifies that dropout data must be gathered using a site called Edweek. But dropout data can’t really be standardized for the reasons that I only began to describe in the third paragraph of this post (I stopped to make sure that you don’t kill yourself from boredom, which would leave a gory mess for someone else to clean up). As local jurisdictions experiment with charter schools and online education, the data in sources like Edweek is only going to become more confusing—and less accurate.

If a YouthBuild proposal loses a few need points because of unavailable or unreliable data sources, or data sources that miss particular jurisdictions (as Edweek does), it probably won’t be funded, since an applicant needs an almost perfect score to get a YouthBuild grant. We should know, as we’ve written at least two dozen funded YouthBuild proposals over the years.

Standardized metrics from funders aren’t always good, and some people will get screwed if their projects don’t fit into a simple jurisdiction or if their jurisdiction doesn’t collect data in the same way as another jurisdiction.

As often happens at the juncture between the grant world and the real world, there isn’t an ideal way around this problem. From the perspective of funders, uniform data requirements give an illusion of fairness and equality. From the perspective of applicants trapped by particular reporting requirements, there may not be a good way to resolve the problem.

Applicants can try contacting the program officer, but that’s usually a waste of time: the program officer will just repeat the language of the RFP back to the applicant and tell the applicant to use its best judgment.

The optimal way to deal with the problem is probably to explain the situation in the proposal and offer alternative data. That might not work. Sometimes applicants just get screwed, and not in the way most people like to get screwed, and there’s little to be done about it.


* About 15 years ago, Isaac actually talked to the demographer who worked at the Department of Education on dropout data. This was in the pre-Internet days, and after multiple phone transfers he just happened to get the guy who worked on this stuff. The demographer explained why true, comprehensive dropout data is impossible to gather nationally, and some of his explanations have made it into this blog post.

No one ever talks to people who do stuff like this, and when they find an interested party they’re often eager to chat about the details of their work.