How an algorithm can kill…
So apparently the Australian Government has been buying its software from Cyberdyne Systems, or at least you’d be forgiven for thinking so given the brutal treatment Centrelink’s autonomous debt recovery software has been handing out to welfare recipients who ‘it’ believes have been rorting the system. Yep, you heard right: it’s a completely automated compliance operation (well, at least the issuing part).
As it turns out, the system is designed, if that’s the right word, to fail deadly rather than fail safe, due to basic algorithmic errors (1) in calculating weekly income coupled with a design assumption that inconsistencies in compared data sets are hard evidence of malfeasance. The reality, however, is that integrating big data sets is almost always a messy and error-prone process (2), so believing you can run hard matching logic over this and then get a high quality output is a wilfully obtuse position to take (3). Evidence of the false positive rate that is an inevitable consequence comes in the form of an internal Centrelink review, which found that out of the hundreds of recovery claims reviewed only 20 or so were actually valid.
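To make the failure mode concrete, here is a minimal sketch of how this kind of error can arise. It assumes, and this is an assumption rather than anything confirmed above, that the weekly figure is obtained by spreading an annual income total evenly across the year and then hard matching it against what the person reported each week. The function name and all the figures are hypothetical.

```python
# Hypothetical illustration only: the function name and figures are invented,
# and the averaging step is an assumed form of the "weekly income" error.

def naive_flags(reported_weekly, annual_income):
    """Flag weeks where reported income is below the annual figure averaged per week."""
    average = annual_income / len(reported_weekly)
    return [week for week, reported in enumerate(reported_weekly) if reported < average]

# A person who worked 26 weeks at $1,000 a week, then was out of work for 26 weeks
# and truthfully reported zero income while on benefits.
reported_weekly = [1000.0] * 26 + [0.0] * 26

# What a second data set (for example an annual tax record) would show.
annual_income = sum(reported_weekly)

flagged = naive_flags(reported_weekly, annual_income)
print(len(flagged))  # 26
```

Every week flagged here was reported honestly. The two data sets agree on the annual total and disagree only because the averaging invents $500 of income in weeks where there was none, so each mismatch is a false positive, not evidence of fraud.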
Of course that doesn’t matter, because the government’s currently punching out 20,000 recovery notices a week and as long as people pay up, why should they (or we) care? Letting the system run in the lead up to Christmas, that least stressful period of the year, is of course brilliant. Combining that with the threat of jail, then placing the onus of proof on the accused while simultaneously demanding a degree of evidential proof that most people cannot provide, is pure bureaucratic genius really (4).
But this is not over by a long shot. Harvesting money from a group which contains people who are least able to deal with being wrongly accused will likely result in casualties in the real world (5). No wonder that the Australian Privacy Foundation called this a ‘cluster-fuck’.
This has been another Kafka-esque moment from the world of big data.
1. I’d call that gross negligence actually, and I’m wondering how anything that obvious got through the software requirements review process. And why is it OK to still use it?
2. Mainly because a lot of the data sets are themselves incomplete or in error. Of course this won’t deter the disciples of ‘big data’ such as the head of the ABS. GIGO does not exist in their world, god bless ’em.
3. One does wonder who green-lighted this project, where the adult leadership was in the Department at the time, and why they didn’t bother to learn from the mistakes of others (see 6).
4. I’m still wondering at the groupthink within the Department that says it’s OK to field a system that you know generates obviously wrong results.
5. The power of bad data is not new. Back in the 1990s, a blood bank’s management software, used to track donated blood, erroneously identified a blood sample as being HIV positive. When the donor was informed, they went home and committed suicide.
6. Recently the Michigan unemployment insurance agency deployed its version of an automated compliance system, only to find that 93% of its recovery notices were false.