R in the CPB and the role of Open Source in Promoting Transparency and Austerity
Thursday, April 12, 2012 at 09:37PM
The Revolution blog links to an O'Reilly interview with two CIOs from the Consumer Protection Bureau. The gist of the interview is "Open Source is Great, we are using it for everything; R and Big Data are the next hot thing, et cetera". I don't mean to belittle those points, as I mostly agree with them, but they are covered well in both the Revolution post and the original interview. However, what is explicitly missing from the interview (but implied by its tone) is how Open Source technologies directly answer two recent and recurring demands of government: Transparency and Austerity. At a time when most of the electorate is calling for more transparent government (and, at least in public, government seems to agree) and maybe 60% of the electorate wants less government spending, open source is an obvious answer.
Open source tools do not directly make government more transparent (I think legislation and institutional/cultural change are the real drivers), but they at least partially address the demand: if government reports and official statistics are produced in R, for example, and the code is posted on git (as is suggested by the interview), then the numerate public can cross-check the nitty-gritty details of whatever assumptions, models, et cetera go into those figures.
The more obvious benefit is austerity. Especially with regard to spatial software, contracts with private vendors are huge (though I have no idea what portion of government spending goes to them). Switching to open source, at least during times of austerity, would not only save money but also force some competition into the relative monopolies held by ESRI, SAS, and others. When I have more time, I'll try to back up those assertions regarding government spending on private vendor software contracts with some hard data.
Reader Comments (1)
Be careful what you wish for. Right now, it takes just a wee bit of digging to discover that nearly all of the economic data published weekly/monthly/quarterly/annually is derived from sample surveys. Just reading the footnotes, never mind running a new routine against some data, often reveals that the null hypothesis cannot be rejected.
Journalists and politicians have relied on these numbers for decades (and never, that I have read, explicitly explain the sampling regime); revealing that the Emperor's New Clothes don't mean anything could be used to incite insurrection. Never mind that said insurrection would be another AstroTurf exercise by the Right.
There is little, if any, Big Data in the policy arena, save for the decennial census. It's a red herring.