Last year, researchers published a study in which ten different job training programs were evaluated through randomised trials – the way medical researchers test new drugs.
For each program, participants were randomly allocated by the toss of a coin. Heads, they received job training. Tails, they were assigned to a control group.
The study found that just one program – Year Up – had a measurable impact on earnings over the medium term.
The good news is that Year Up increased participants’ earnings by over US$7,000 per year. The bad news is that the other nine programs didn’t deliver: perhaps because people didn’t complete the training, maybe because employers didn’t value it, or possibly because graduates couldn’t find suitable jobs.
The example shows that we need rigorous evaluation not because program designers are foolish or careless, but because many problems that government confronts are really, really difficult.
Indeed, designing an effective job training program seems to be about as tough as developing an effective drug. Nine out of ten pharmaceutical treatments that looked promising in the laboratory fail to make it through clinical trials and on to the market.
Just as we do in medicine, social policy experts need to be honest about removing or redesigning ineffective programs, so that funding can be redirected to ones, like Year Up, that pass the test.
Alas, too many evaluations today compare those who choose to participate with those who choose not to. The problem with that approach is that we learn more about people’s choices than about the program itself. The kinds of people who opt into job training are likely to be different from those who avoid it, so simply comparing participants with non-participants isn’t a good way to evaluate a program.
Perhaps this all sounds a bit abstract. So to see the danger of using observational studies rather than randomised trials, pull up a bar stool while I tell you about the latest alcohol research.
Using observational data, many health researchers had noticed that moderate alcohol drinkers tended to be healthier than non-drinkers or heavy drinkers. This led many doctors to advise their patients that a drink a day might be good for their health.
Yet the latest meta-analyses, published in the Journal of the American Medical Association, conclude that this was a selection effect. In some studies, the population of non-drinkers included former alcoholics who had gone sober, and moderate drinkers tend to be healthier than non-drinkers on many other dimensions, including weight, exercise and diet. Studies that use random differences in genetic predisposition to alcohol – an approach known as Mendelian randomisation – find no evidence that moderate drinking is good for your health. A daily alcoholic beverage isn’t the worst thing you can do, but it’s not extending your life.
The problem extends to just about every study you’ve ever read that compares outcomes for people who choose to consume one kind of food or beverage with those who make different consumption choices. Health writers Peter Attia and Bill Gifford point out that ‘our food choices and eating habits are unfathomably complex’, so observational studies are almost always ‘hopelessly confounded’.
More reliable results come from randomised nutrition studies. These require volunteers to live in a dormitory‑style setting, where their diets are randomly changed from week to week. Nutritional randomised trials are costlier than nutritional epidemiology, but they have one big advantage: we can believe the results. They inform us about causal impacts, not mere correlations.
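To make the contrast concrete, here is a minimal simulation of the selection problem described above – a sketch with entirely hypothetical numbers, not figures from any of the studies mentioned. In it, training has no true effect on earnings, yet a naive comparison of self-selected participants with non-participants shows a large ‘benefit’, while a coin-toss randomised trial on the same population correctly finds nothing.

```python
import random

random.seed(0)
N = 100_000

def earnings(motivation):
    # Hypothetical: motivation alone drives earnings.
    # The training itself has zero true effect.
    return 50_000 + 10_000 * motivation

# Observational comparison: people choose whether to participate.
participants, non_participants = [], []
for _ in range(N):
    motivation = random.gauss(0, 1)
    if motivation > 0:                      # motivated people opt in
        participants.append(earnings(motivation))
    else:
        non_participants.append(earnings(motivation))

naive_gap = (sum(participants) / len(participants)
             - sum(non_participants) / len(non_participants))

# Randomised trial: a coin toss, not motivation, decides who trains.
treated, control = [], []
for _ in range(N):
    motivation = random.gauss(0, 1)
    if random.random() < 0.5:               # heads: job training
        treated.append(earnings(motivation))
    else:                                   # tails: control group
        control.append(earnings(motivation))

rct_gap = sum(treated) / len(treated) - sum(control) / len(control)

print(f"Observational 'effect': ${naive_gap:+,.0f}")  # ~ +$16,000, spurious
print(f"Randomised estimate:    ${rct_gap:+,.0f}")    # ~ $0, the truth
```

The coin toss is doing the work here: randomisation breaks the link between who signs up and who would have earned more anyway.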
Using that insight, we’ve established the Australian Centre for Evaluation in the heart of government. The Centre will address a problem that a slew of independent reports has identified: evaluation in the federal government tends to be low-quality or non-existent.
A core role for the Centre will be to champion randomised trials and other high-quality impact evaluations. It will partner with government agencies to initiate a small number of these evaluations each year.
An example of the kind of trial that might prove fruitful is the Victorian Government’s Healthy Homes Program, targeted at low-income people. An evaluation found that spending a few thousand dollars on insulation and better heaters reduced use of the health care system: people who received the energy efficiency upgrade spent 43 minutes a day less in a cold house. Within three years, taxpayers had made their money back through savings in the government’s health budget.
Good evaluation will help identify effective programs – like Year Up and the Healthy Homes Program. And it will save taxpayers money by identifying ineffective programs, so we can end them.
The Australian Centre for Evaluation isn’t ideological; it’s practical. The more we can figure out what works, the better we can make government work for everyone.