22 April 2017

Reinforcement learning and the public sector

Will Knight writes about reinforcement learning:
Reinforcement learning is a way of making a computer learn through experience to make a series of decisions that yield positive outcomes—even without any prior knowledge of how its actions will affect its immediate environment. A software-based tutor, for example, would alter its activities in response to how students perform on tests after using it. Reinforcement Learning, Will Knight, 'MIT Technology Review', March/April
It's the same approach used by AlphaGo, the program developed by DeepMind, which 'mastered the impossibly complex board game Go and beat one of the best human players in the world in a high-profile match last year.' It's also, at least in theory, the basis of our economic system: the learning process that is supposed to go on through the creative destruction of business enterprises that fail to adapt to changing circumstances. As applied to driverless cars, 'the software governing the cars’ behavior wasn’t programmed in the conventional sense at all.' The software learns through trial and error.

Large corporations and their friends in government have seen to it that competitive markets play a lesser role than they should in determining our economic prospects. Corporations succeed today not so much through competing in markets, but more by manipulating the legislative and regulatory environment, and participating in effective cartels with other powerful players, both private- and public-sector. But the principle - that of survival of the most adapted, and the quickest to adapt - has generated enormous material wealth, and continues to operate, though more to the benefit of the already-wealthy than to ordinary people.

With its record of success, why isn't reinforcement learning, or competition to find the best approaches, allowed to operate in the public sector? That is, why don't we allow competition to solve our social problems? It's partly for historical reasons. When governments began to take an interest in the welfare of ordinary people, solutions to their basic problems were fairly easy to identify. Requirements for things like sanitation, elementary education, a police force, fire engines and hospitals were - and are - difficult to argue against. But as society has grown more complex, so too have our social and environmental problems. How, for example, do we go about reducing crime, improving our (already relatively high) levels of physical health, improving our mental well-being, or ending war? No obvious solutions leap to mind.

Which is where Social Policy Bonds could enter the picture. Rather than let public-sector organisations have a monopoly on trying to deal with these problems, a bond regime would, in effect, contract out finding the best solutions to the private sector. A goal such as eliminating war is going to need the exploration, deployment and refinement of a multitude of potential approaches. More than that, though, it's going to need to reward the most effective of these approaches and to terminate the useless ones. There are well-meaning organisations working to end war, but the people working for them are not rewarded for their success. There are no built-in incentives for the organisations to find optimal solutions, nor for inefficient organisations to be dissolved if their efforts prove futile or counterproductive. The result is that the challenges humanity faces, including environmental calamities or nuclear proliferation, are nowhere near being effectively met. But, unlike computers learning how to drive cars, or businesses operating in competitive markets, our failed approaches and ineffectual organisations aren't terminated. Sometimes, in fact, our politicians pump even more money into them.

Social Policy Bonds would change that. They would channel the market's incentives and efficiencies into the discovery and implementation of diverse, adaptive and, above all, efficient approaches to our social and environmental problems, including those, like war, that many of us have concluded have no solution. Our current system places responsibility for the solution of our problems in the hands of large, usually monopolistic, organisations that face little competition and have no incentive to try diverse approaches. There's certainly no reinforcement learning. It is because of its ability to stimulate diverse, adaptive approaches that a Social Policy Bond regime could succeed where existing policies have failed.

15 April 2017

What are governments for?

George Monbiot quotes from Kate Raworth's Doughnut Economics, saying that:
[E]conomics in the 20th Century “lost the desire to articulate its goals.” It aspired to be a science of human behaviour: a science based on a deeply flawed portrait of humanity. The dominant model – “rational economic man”, self-interested, isolated, calculating – says more about the nature of economists than it does about other humans. The loss of an explicit objective allowed the discipline to be captured by a proxy goal: endless growth. Circle of life, George Monbiot, 13 April
I've inveighed for years against the de facto goal of governments everywhere which, in the absence of clear, explicit goals, has become economic growth, expressed as the rate of increase of Gross Domestic Product, or GDP per capita. The grievous effects of this mistargeting on income and wealth distribution, the environment, and much else, are now becoming apparent. More recently, I've asked 'What are economists for?' Governments as well as economists have lost their way. As Raworth says, they do not think in terms of explicit goals that are meaningful to ordinary people. They are preoccupied with process and complexity:
The Dodd-Frank bill, like Obamacare, is tyranny by complexity. Consider the Glass-Steagall Act, at 37 pages in length, and the 2,319-page monstrosity of the Dodd-Frank Wall Street Reform and Consumer Protection Act. Charles Hugh Smith, 6 April
The only people who follow and understand our policymaking system are those politicians, bureaucrats, academics, lawyers and lobbyists who receive monetary rewards, sometimes vast, for doing so. The 'rational economic man' just happens to be the person who, in our sad attempts to purchase things that used to be supplied by the commons, maximises economic activity: that is, GDP. I think most people now see the flaws in targeting GDP as if it were an end in itself, but we are less united over what to do about it. My suggestion is that we start to express our policy goals in terms of broad, verifiable, explicit outcomes that are meaningful to ordinary people. Social Policy Bonds could do this, and inject the market's incentives and efficiencies into the achievement of those outcomes.

Under a Social Policy Bond regime, governments could still do what they are good at: raising revenue to achieve our social and environmental goals. And, though it would be a departure for most of them, they could learn to articulate these goals, as expressed and debated by the people they are supposed to represent. They could even, through government-financed bodies, help achieve these goals, but only if, through the Social Policy Bond mechanism, bondholders have confidence in their ability to do so more efficiently than other investors in the bonds. The current model, which, as Raworth points out, is dominant - that of 'rational economic man' - assumes and entrenches a paradigm that undervalues and often conflicts with community, the environment, small enterprises, and ordinary people's mental well-being. The short-term interests, as measured by accountants, of large organisations, both public- and private-sector, are privileged at the expense of the rest of us. Social Policy Bonds would change that, right from the start, by posing the question that we cannot continue to evade: what are governments for?

06 April 2017

The European Union: just like every other organisation

From Nigel Farage's recent speech to the European Union Parliament:
Eighty-five percent of the global economy is outside the EU. If you wish to have no deal [with the UK] it is not us that will be hurt. ... A return to tariffs will risk the jobs of hundreds of thousands of people living in the EU. What you are saying is that you want to put the interests of the European Union above that of your citizens and your companies. Nigel Farage, 5 April
Earlier in the same speech Mr Farage also pointed out that in 1973 the UK voted to stay in the European Economic Community, not a political union. The speech neatly illustrates two principles which I deem axiomatic:
  • Every organisation, be it a church, trade union, university, government or whatever, will always seek to overplay its hand. 
  • Every organisation will, sooner or later, forget its founding ideals and its stated objectives, and devote its energies to self-perpetuation.
In the private sector there is, or should be, the discipline of the market and competition, though we increasingly see larger companies winning out over small businesses not through fair competition but via manipulation or subversion of the market.

Social Policy Bonds would lead to a new type of organisation: one whose goals would exactly coincide with those of society. That's one feature that would differentiate them from all other organisations in history. The other is that the goals of the organisation would either themselves be, or be inextricably linked to, society's well-being. Every activity of a coalition of holders of Social Policy Bonds would be undertaken with exactly one objective in mind: to maximise the efficiency with which society's targeted social or environmental goal is achieved. The activities, structure and composition of the organisation would be entirely subordinated to that one objective.

The difference between a Social Policy Bond regime, and any other current policymaking system, is stark. Under a bond regime governments would do what they do best (democratic governments anyway): that is, articulate society's goals and raise the revenue for their achievement. Then they'd, in effect, contract out the achievement of these goals to new organisations whose interests are exactly those of society. Their focus, and their striving for efficiency, are built into the Social Policy Bond mechanism. All society would benefit.

My reasons for voting for Brexit are given here. For more on Social Policy Bonds see the overview papers linked to here.