Stanford HAI 2019 Fall Conference – AI in Government

– All right, welcome. I’m Dan Ho. I’m a professor at the law school here, also with affiliations with the political science department, SIEPR and HAI. And we heard this morning a really interesting discussion between Eric Schmidt and Marietje Schaake about regulating big tech. And I think this panel builds on that discussion really nicely. The topic we’re gonna tackle
in this breakout session is: AI In Government. And we’re thinking of that really as having two constituent components. One is, this panel’s gonna provide a little bit of a better
descriptive understanding of how AI is being adopted
in the public sector. And then try to get a little
bit closer to the ground on potential pathways for
governance in this space. And we have just a terrific panel here to tackle these two topics. The structure is gonna be, first of all, to have each of our four panelists start off with brief presentations,
no more than 10 minutes followed by sort of Q and A specific to those presentations. And we’ll limit those also to 10 minutes. And then we’ll open it up for a more general discussion
between the panelists and then open it up to
the audience as well. It’s very much meant to be conversational, hopefully inspired by the
presentations at the tail end. And to make this simple we’re gonna follow the
alphabetical order of the program. And my role here is mainly
to get out of the way. So, let me give you brief
introductions of the speakers who are gonna be tackling this topic. Our first speaker will be David Engstrom who is a professor of law at the law school here at Stanford. He’s also Associate Dean
for Strategic Initiatives and teaches a wide range of subjects from civil procedure and administrative law to federal courts. He received his law degree from Stanford, his master’s from Oxford and a PhD in political science from Yale. The next panelist will be
Sharon Bradford Franklin who is the Policy Director for New America’s Open
Technology Institute. There she’s Co-Director of
the Cyber Security Initiative. And has a really wide range of experience most relevant here. She brings a wealth of knowledge on cyber security, freedom of expression, government surveillance,
privacy, transparency and platform accountability. And prior to joining OTI Sharon
served as Executive Director of the Privacy and Civil
Liberties Oversight Board an independent Federal agency that reviews counter terrorism programs to ensure that they include
appropriate safeguards for privacy and civil liberties. And she received her law
degree from the Yale Law School and clerked for Judge Jane Roth. Our third panelist is Peter Loewen, who is Professor of Political Science at the University of Toronto. He’s a fellow with the
Public Policy Forum. We were pleased to have him last year as a fellow at CASBS here
on the hill at Stanford. He directs the Policy Elections
and Representation Lab at the University of Toronto. And was formerly the Director of the School of Public
Policy and Governance at the University of Toronto. And he’s done, he’ll
share, I think, with us some of his really interesting research on the role of technology
in improving governance and representation of public opinion about sort of the regulation of AI tools. Peter received his PhD from
the Universite de Montreal. Our last panelist is Brittny J. Saunders who is Deputy Commissioner
for Strategic Initiatives at the New York City
Commission on Human Rights. She is Co-Chair of New York’s Automated Decision Systems Task Force whose charge is to develop a process for understanding algorithms
used by the City of New York particularly when it comes to equity, fairness and accountability. And she previously worked
for the Office of the Mayor as Acting Counsel where she
was quite heavily involved in the city’s broadband equity efforts. And she has done a wide range of work pertaining to legislation
about discrimination in credit and criminal history
in the employment context which was done by the
Commission on Gender Equity. She previously worked for the
Center for Popular Democracy on immigrant rights and racial justice. And holds a law degree
from Harvard Law School. So, with that out of the way I think we’ll have David
Engstrom as the first presenter. – So, I am gonna spend my 10 minutes here reporting some preliminary results of a project that Dan and I have actually been
co-leading here at Stanford along with our colleague Tino Cuellar and also NYU’s Cathy Sharkey. And the impetus for the project was our view that much of the debate over public sector use of AI has focused rightly on state and local use especially within the
criminal justice system. And so, that means things
like predictive policing and criminal risk assessment about which there’s been lots of debate. And the goal of our project
was to look at a different but we believe also
quite important frontier which is the use of AI by Federal agencies for civil regulatory purposes. And so, right now I’m just
gonna do a 10 minute overview of some of the findings of that project to try to prime our conversation
here this afternoon. So, I’ll start out at a high level of abstraction, go
all the way up to 20,000 feet and present just a few results from what we call a canvass
of the administrative state and in particular the use
of AI by Federal agencies. And what we did is we assembled
a team of 25 students, 15 law students and 10
computer science PhDs. And then we turned them loose on the Federal administrative state. And we had them look across
roughly 150 departments, agencies and sub agencies and to try to surface as many AI use cases as they possibly could. And ultimately they found roughly 170. This has allowed us to ask and answer three kinds of questions. So, one type of question
you might ask then is: So, where is AI being used? And the answer is: Across the waterfront of government action. And this can give you a sense of that. So, this is a breakdown of
the different use types. Some of these are a little
pedestrian but important. So, use of various ML tools to aid in the procurement process or
to manage agency employees. Lots of regulatory
monitoring and analysis. But many of the use cases in here are actually quite consequential. And you can see AI moving
progressively closer to the coercive and redistributive heart of the Federal government. So, for instance, lots of enforcement at some very important
agencies, agencies like: The IRS or the Centers for
Medicare and Medicaid Services or the EPA or the SEC. – Dave, these percentages are of all the cases, not of enforcement? So 26% isn’t 26% of all enforcement. – That’s right, it’s of all the cases. Correct, thank you, Margaret. So, we’re also seeing
quite a bit, at least some, not quite a bit but some use
in the adjudicatory context. And that’s really important
’cause that means things like: Disability benefits and
also patent licenses. So, what’s a second type of
question we might wanna ask? This is maybe for the engineers. What types of techniques
are Federal agencies using? And here’s a sense of that. So, take the use cases, the roughly 250, some of which are sort of general claims of automation. But we identified 171 that actually used some
type of machine learning. And we asked our engineers
to actually identify further what particular techniques
were being used. What we found is that the vast majority involve supervised learning of some kind where the agency is trying
to predict something such as: Facial recognition at the
Customs and Border Protection, storm tracking by the National Oceanic and Atmospheric Administration or chatbot responses at HUD. We also found small instances
of unsupervised learning such as the topic modeling of complaints received by the Consumer
Financial Protection Bureau. So, this is the agency
that Elizabeth Warren helped to set up. Third type of question: Where
do these tools come from? And here’s a sense of that. Surprisingly most of them come from in-house technical capacity of the agency. Some, in addition, come
from, as you could imagine, the contract or procurement space. And then some come from competitions, that’s what non-commercial collaboration is. These are competitions initiated by the agencies in which they provide essentially prizes and try to incentivize private sector provision of solutions. That’s just a taste of
our overview of AI use within the Federal administrative state. Where do we go from here? Let’s go all the way
down to the ground level and let me give you a couple
of very specific use cases and give you some details on those which I think will help
prime our conversation. So, start with enforcement. I already mentioned that several
agencies: the IRS, CMS, EPA are developing or using algorithmic tools to predict who is violating the laws and thus to better focus agency enforcement resources. But I wanna focus you in
on the SEC a little bit, which is using multiple tools. So, some of what the SEC is doing is quite predictable. They have a bunch of
structured transaction data. And so, they’re using ML models to try to predict who’s
engaged in insider trading. But some of their tools are
actually far more sophisticated. The SEC is also using natural
language processing, NLP to parse unstructured
narrative disclosures to predict which investment brokers might be violating the
Federal securities laws. Now, these tools raise all
sorts of interesting questions, some of them technical, they’re
interesting distributive and political questions. But we think that some
of the most interesting implications of these new
algorithmic enforcement tools involve the legal requirements of transparency and reason giving. And for the non-lawyers in the room, the puzzle is that despite the fact that enforcement sits at the heart of the coercive power of the state, the law generally
insulates agency decisions to enforce or not enforce
from judicial review. And the reason is that when agencies are searching for needles in haystacks and using these as part of
their enforcement duties we worry that their decisions aren’t actually very intelligible especially to generalist judges. And so, we just hive off
enforcement decision making from judicial review. Interestingly, these new
algorithmic enforcement tools can make that situation worse and better. Could make it worse because of the opacity of these algorithmic tools which can actually make
the agency’s decision more inscrutable than before. But it’s also possible
that it makes it better. So, these algorithmic tools, they actually encode the legal principles and also the agency’s priorities. And so, they could end up making enforcement decisions more intelligible and they could cause us to rethink that presumption that we don’t review agency enforcement decisions. Moreover, maybe these
algorithms qualify as rules under administrative law and therefore would have to be subject to notice and comment review. The result is that, counterintuitively, algorithmic enforcement, so long as we handle it correctly as a legal matter, might on net produce an enforcement apparatus that is more accountable in terms of who must face the coercive power of the state than at present. So, that might be a happy story so long as the law handles
it in a wise manner. What about a second case study. Let’s turn to adjudication for a moment. There are a number of
mass adjudicatory agencies within the Federal administrative state. And they face a couple of
very common challenges. One is inter-judge disparities
in decision making. So, this is a graph that shows you that within the Social Security Administration there are some administrative judges that grant disability benefits
five percent of the time and some that grant 95% of the time which means that something
other than the merits must be driving these decisions. Another common problem at these agencies is massive backlogs. This is the file room at the Board of Veterans Appeals, where it can sometimes take seven years for a veteran to get a decision. The Social Security Administration is developing some very innovative tools to try to deal with these problems. Some of these tools do triage. They try to cluster similar cases so that the administrative judges can move more equitably and efficiently through the body of case law. A second type of tool is called the Insight system. It uses a machine learning tool to catch errors in draft decisions as in this pop up that tells
the administrative judge to please evaluate her decision. And it leverages a manually
developed decision tree capturing 2,000 possible outcomes within a five step decision process. And therefore determines
whether the adjudicators have used the proper pathing. And so, it can actually
identify 30 types of problems within draft decisions. So, as with enforcement
there’s some interesting technical and legal challenges here that we can leave for Q and A. But we think that one
of the most interesting issues that this highlights is the challenge of
government capacity building and in particular the need for
embedded tailored expertise that marries law and technical capacity. And it just seems clear to us, having done a site visit with SSA, now that we understand better how this tool works, that SSA couldn’t have possibly
designed and implemented an effective, fair and
legally compliant version of this insight tool without a detailed understanding of those 2,000 possible outcomes, of the regulatory backdrop and of the organizational culture of the administrative judges
who must actually use the tool. And so, in our interactions with SSA at least it’s clear that one
of the biggest challenges facing government in this space with regard to public sector
development of AI tools is the risk of a myopic focus on getting the next tool
through the procurement process rather than building and maintaining internal capacity to innovate and to be able to critically
evaluate those innovations. Okay. Let me close with one thought which is we have tried to think hard about how we might build a sensible
accountability structure around these kinds of tools. And we can talk during Q and A about what some of the
regulatory mechanisms would look like. One of the really interesting questions is whether some of the very
standard regulatory mechanisms might work quite well to build
that accountability structure around what these agencies are doing. But we are also trying to
develop some other possibilities. And so, one idea that we have is that perhaps we could
impose a requirement that these agencies engage
in something we would call, that we call: Perspective benchmarking. And here’s the idea, the idea is that agencies
that want to adopt an algorithmic governance
tool of some sort must reserve a random test sample and work up those cases in
the old school analog fashion. So, in the SEC context
the line level prosecutors could be required to fully
work up and investigate cases without the aid of risk scores, without the aid of
these algorithmic tools. In the Social Security
Administration context perhaps the Insight system
could be deactivated for a random holdout set of cases. And the idea is it would
provide a ready made comparison across the analog and new AI worlds that could actually
help us to get purchase on how these tools are performing. Already over my time so I’ll stop there. But hopefully I’ve primed the conversation at the Federal level at least
along those three dimensions. So, thank you.
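The prospective benchmarking design described above, reserving a random holdout of cases to be worked up without the algorithmic tool, can be sketched as a simple randomized assignment. This is a hypothetical illustration; the case IDs, the 10% holdout rate and the function names are assumptions, not any agency's actual protocol:

```python
# Illustrative sketch of "prospective benchmarking": reserve a random
# holdout of cases to be worked up in the old-school analog fashion,
# with the rest processed with the algorithmic tool's assistance.
# All names and the 10% rate are hypothetical, for illustration only.
import random

def assign_arms(case_ids, holdout_rate=0.10, seed=42):
    """Randomly split cases into tool-assisted and analog (holdout) arms."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    k = int(len(case_ids) * holdout_rate)
    holdout = set(rng.sample(case_ids, k))
    return {cid: ("analog" if cid in holdout else "tool") for cid in case_ids}

cases = list(range(1000))  # hypothetical case identifiers
arms = assign_arms(cases)

n_analog = sum(1 for arm in arms.values() if arm == "analog")
print(f"{n_analog} of {len(cases)} cases reserved for analog review")
# → 100 of 1000 cases reserved for analog review
```

Comparing outcome measures between the two arms (error rates, processing times, grant rates) then gives the ready-made analog-versus-AI comparison the talk describes.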
(group clapping) – So, we have time for maybe eight minutes of kind of discussion and questions specific to this presentation. I’m happy to start off if
people are sort of gun shy. Maybe, David, could you give the audience a little bit more of a sense of some of the kind of cultural differences that the research team observed in terms of the willingness and openness by different government agencies to actually adopt these tools? One of the kind of things that came out of the discussion from
this morning, I think, between Susan and Eric was the fact that 80% to 90% of the time is really about the front end of building the data infrastructure, hiring the right people,
changing the organizational, sort of the organizational process. And, you know, maybe you can speak to that a little bit as to the differences between the types of agencies that, like the SSA, are actually willing to develop these kinds of tools as opposed to ones that are not. – Okay, great, sure. So, I can actually stick with the two case studies that I presented. So, the SEC and the SSA. I think that provides a nice contrast. The SEC is a very well-heeled agency. It’s an agency that, as you know, regulates the financial services industry. And they have quite a bit
of technical capacity. And there is quite a bit
of entity level interest in developing these sorts of tools. And so, they described for us meetings every week, every couple of weeks in which all the different technologists within the agency gather
and compare notes. And so, there was a very strong innovation culture at the SEC. And the result is the SEC is way ahead of most other agencies in terms
of developing these tools. And as I showed some of those
tools are quite sophisticated. The NLP tool that parses
unstructured narrative disclosures is an impressive tool. So, contrast that with the
Social Security Administration which is an agency that has to move enormous amounts of product and you would think would
benefit greatly from automation. It has the tools that it has largely because of the
entrepreneurial efforts of a few employees and in particular an
Appeals Council judge, his picture was on the
slide, that’s Gerald Ray. And he had this idea that he could do this and that it could work. But he went about it in
a very strategic way. He hired lawyers who also
had tech sides to them. So, lawyers who could code. And he brought them in and he
put them on lawyerly duties. And he bided his time. And he waited until they had enough seniority within the agency that he could start to deploy them on more controversial projects. And so, Gerald Ray has been kind of the one-man show at SSA in making a lot of this happen. So, that gives you a sense. And I think he’s run into a
fair few barriers along the way. We’ve met with one of his mentees who is, Gerald is now retired one of his mentees who’s
trying to drive this forward. And the sense is that there is some receptivity
to it within the agency but maybe not as much as you would guess given that it is one of these
mass adjudicatory agencies that’s so thoroughly laboring under those two concerns that
I put in front of you: on the one hand, the concern about inter-judge disparities, which you would think an automated process could help to narrow a bit. And then also these terrible backlogs, which mean that people who are in some pretty desperate circumstances in a lot of instances, people who are needing
disability benefits, you know, are made to
wait a very long time to get a determination as
to whether they’re eligible for those benefits. – I have two questions in the back. (Dan speaking) – Thank you, very much. So, you mentioned how the different Federal
agencies implement AI, but I was just curious: these agencies also each regulate separate industries and–
– Mm hm.
– They might have different approaches to regulating AI in those sectors. My understanding is that the National Highway Traffic Safety Administration has been much more liberal and has promoted innovation in AI in the automobile industry. And then the FTA hasn’t. So, I was wondering if you encountered any of those sentiments
while visiting agencies. And then, how would one
eventually think about it, if there are different approaches to regulating AI in different sectors by these Federal agencies, what’s the overarching role–
– Mm hm. (man speaking)
Right, okay. So, I guess everything I just said about the SEC and the SSA
and the different cultures, that is surely also present in terms of agencies that are thinking about regulating AI out in the world. We haven’t had so much access to that, I have to say, because we have tried to
keep this project cabined and focused on government use of AI. In part that’s because we’re
experts in administrative law, we love thinking about how
bureaucracies do their work. And in part it was just
to keep it tractable. You can’t sort of do everything. Where we’ve run into it a little bit is this idea of internal technical capacity. And we think that’s so important. And I tried to hit on it
a bit in my presentation. But you can imagine that
building technical capacity can both fuel the development of usable, fair, accountable
governance tools. But that technical capacity
is also very important in terms of the agency’s ability to, to credibly regulate AI out in the world. So, NHTSA for instance is thinking hard about how it can build
its technical capacity because it knows with autonomous vehicles coming online at some point, we don’t know exactly when
they’ll come on at scale but that they know that at that point they’re gonna need to know quite a bit about the underlying technologies if they’re gonna be able to regulate them in an effective way. Maybe I’ll leave it at that and we’ll move to a second
question if that’s okay. – Yeah. – Thanks, David. The quite striking thing was the 50% rate at which these are being
produced internally. I think linking that to your
last point about benchmarking I wonder if you can share with us the decisions that have
been taken about use. ‘Cause one of the challenges
inside a Federal agency with this internal capacity
that’s generating it is: What is the evaluation about, whether that use is appropriate, whether it’s achieving
the right objectives and is the quality that one needs, what outcomes are being optimized, what outcomes are being traded off. And so, I wonder if you can share a bit from some of these examples. You’ve got a set of, you’ve either got an innovation enthusiast at the top or an innovation enthusiast somewhere who’s generating some
new way of doing things. And the decision to actually
use and institutionalize that, on what grounds is that
decision being made? – Mm hm, mm hm. So, I do think that, so, I’m not sure. I’m trying to think of a good example where we’ve got real transparency over that kind of a decision. In general what we’ve found
is outside the SEC context a lot of the agencies are
developing these tools in very bespoke ways which is to say it takes a
very particular entrepreneur to put forward a tool. And generally, not generally
but often this means an economist who learns some
machine learning on the side, on his weekends or on her weekends and then comes up with a tool. And so, I wish we knew more
about once you have that tool what kind of resistance or what kind of questions get asked as
it moves up the chain. And I just don’t have a
great example for you there. I do think that this benchmarking idea, I think one of the reasons
we’re attracted to it is that we haven’t found
lots of rigorous validation or other types of testing of these tools. And I guess one reason why, although there’re probably lots of reasons, is that there’s like a basic catch-22 here, which is that the same
budgetary shortfalls, the same hard budget constraints that lead agencies to need to develop these automation tools in the first place also prevent the kind of full
testing that you would want. We think benchmarking
is a great intervention for that reason. But it does impose a tax on innovation because you do have these agencies, many of which are underfunded, especially in the current
fiscal environment you worry that they can get
a tool just over the hump, just get it deployed but they don’t actually
have the resources they need to do some kind of full
validation or testing. And so, if there is an
external regulatory requirement, and there’s lots of ways to structure this, but one way to do it is to make it a hard requirement, that does put a tax on the agency. And it means that we might
have beneficial tools that don’t come online at all as a result. So, I don’t know, that’s
just a couple of thoughts I wish I had a, you know, I wish we had a case study
of that adoption process that could give us some insight as to what I take to be the more general thrust of your question. – Thank you, David. – Great, thank y’all. Thanks, all. – So, I’m gonna talk about several uses of AI by government
that pose certain risks. And I just wanna start out by saying my point here is not that all uses of AI by government are bad. I don’t want to go there at all. There certainly are some beneficial uses. But when you’re talking
about use by government as opposed to use in the commercial space the government has the power
to take away your liberty and to present you with or deny services that can be potentially life saving or highly consequential to your life. So, the stakes are a bit
different in that context. And I wanna, right now I’m not doing
very well on my clicking. I’m used to a different kind
of laptop, bear with me. (man speaking) Okay, there I go, thank you. I’m trying to use the mouse. Sorry, right, the different industries and different laptops. Okay, thank you. So, I wanna talk today about
three illustrative categories of government use of AI that raise particular risks to individuals and talk a little bit about each one. And those are the use in terms of public benefits determinations, use in pre-trial risk assessments and the use in various
types of surveillance particularly facial
recognition technologies. And so, starting with the case of use of AI in public benefits. This is an area where
you really can be denied benefits that are a
matter of life and death and/or highly consequential
to your health. And there was a book that
came out in early 2018 by Virginia Eubanks. I don’t know, some of you
may be familiar with it, called: Automating Inequality. And she did a study where
she examined case studies in Indiana, in Pittsburgh
and in Los Angeles where government agencies
were relying on AI for a variety of
entitlement type decisions. And, for example, in her
examination of Indiana she looked at a system the state set up to automate
decisions on entitlement to a variety of welfare benefits. And found that between 2006 and 2008 when they first launched the system Indiana denied more than
a million applications for food stamps, Medicaid
and cash benefits which amounted to a 54% increase
over the prior three years when they didn’t have
that system in place. And what she focuses on and this doesn’t mean,
again, all AI is bad but where you have a system that is relying on machine
learning algorithms, where people don’t know what it is doing, and it is a very rigid system, it really compounds the
harms to people who are poor, who don’t have access to
the same kinds of resources both financial and access
to making complaints to a government official
that other people do. And it just compounds the problems that those individuals face. She found in particular that many errors were the result of inflexible rules or any deviation from the rigid
process that it had set up led to a determination that was called: Active refusal to cooperate, which automatically resulted
in denial of benefits. And in part she also found that the system was designed deliberately
to remove the human element of relationships between a case worker and somebody who would get benefits because that was seen as
an invitation to fraud. And the consequences of that really compounded the difficulty for people to get access to benefits and to challenge erroneous decisions. And relatedly there is this problem with administrative law. I’m very intrigued by David’s
discussion in that context. In 2008, Danielle Citron, a law professor, published an article called Technological Due Process, which was fairly influential in this area, looking at how, when
agencies decide entitlement to benefits using machine learning, or what we often call black box algorithms, they cannot provide individual due process in the way we think of it: meaningful notice and a meaningful opportunity to be heard. People just cannot mount these challenges. So, I’m very intrigued by the suggestion of having your algorithm put out for notice and public comment rulemaking, which would really directly
address that challenge. The other point she makes in her piece is when agencies delegate
decision making to algorithms this in many ways undermines
some of the rationales that we have for our administrative state where we say: Courts are
gonna defer to the expertise of an agency making the decisions if the agency instead
of having its in house, relying on its in house expertise is deferring to an algorithm
that they don’t understand it undermines that rationale. But some of the ways that people have pushed back against this have involved challenges
through litigation. And one example I have
up here on the slide is in the State of Michigan. In 2013 they initiated a system to detect fraud in unemployment claims, leading to almost 50,000
people being accused of fraud. There was a lawsuit filed in 2015. The evidence uncovered in the lawsuit ultimately showed that 90% of those determinations were inaccurate, and the State of Michigan fortunately has been reversing those findings
based on that lawsuit. So, the second category I wanna talk about is the use of AI in
pre-trial risk assessments. And this is a case
where we really can have real risks of perpetuating
bias in the system. The bias in our criminal justice
system in the United States sadly is well-documented and algorithmic tools can only be as good as the data that they’re based on. And so, there is a tendency in this area to really perpetuate and
exacerbate the biases that are found in our
criminal justice system. Most notably, a sorta easy example is that we know that arrest records are biased against people of color because of over policing
in those communities. And so, the training data
can perpetuate that bias. There was a study in 2016 by researchers at ProPublica, which got some attention, showing these kinds of racial disparities in forecasting who would
be repeat offenders and should therefore at the pre-trial risk assessment stage be incarcerated. And compounding this here is an issue related to what I talked about in the public benefits context where you have a lack of transparency that really restricts defendants’ ability to challenge these assessments. In this context in terms of pushing back in July of 2018 a broad
coalition of more than a hundred civil rights, digital justice and community based organizations including my own organization,
the Open Technology Institute released a shared statement
of civil rights concerns regarding the use of pre-trial
risk assessment instruments. It highlighted some of these risks that I’ve been talking about: perpetuating patterns
of racial discrimination. And while criminal justice reform is not an issue that I work on that much directly, my organization joined it, and some of the allies and people who are leading this charge really have been using this shared statement to engage in conversations with some of the jurisdictions who are employing these tools, and having some success with them. The final category I
wanna highlight here is the risks of using AI for surveillance, focusing in particular on facial recognition. Here, too, we have a problem of perpetuating and exacerbating problems with accuracy and bias. Joy’s presentation this morning covered this very effectively, so I don’t need to say any more about what those issues are. But also, when you use these kinds of tools for surveillance, it really increases
the power of the police to conduct pervasive and
intrusive surveillance at scale in ways that historically police officers have not been doing. Our courts, and the Supreme Court, are starting to recognize that this is not just a difference in quantity of surveillance but can be a difference in kind, where it may rise to levels of violating reasonable
expectation of privacy and triggering certain safeguards. But there’s also the issue of what kind of tracking capabilities do police have with facial
recognition technology. If your jurisdiction has a
public video surveillance camera system set up, with all the footage, can the police use facial
recognition to try and track an individual’s movements throughout the city in real-time, or, not in real-time, through their database of footage to track somebody, and what standards should apply there. That raises both Fourth Amendment risks and also potential chilling
of free expression. One study looking at this more globally: the Carnegie Endowment put out an AI Global Surveillance Index. You see the findings here on the slide; they found that at least 75 of 176 countries globally are actively using AI technologies for surveillance purposes, including smart city technologies, facial recognition systems, where it’s 64 countries, and smart policing. They also had a finding that governments in autocratic and
semi-autocratic countries are more prone to abuse AI surveillance than governments in liberal democracies. And finally, I know I’m
running low on my time. I wanna talk a little bit
about some efforts to challenge the use of AI for surveillance. Joy referred in her presentation today to movements to call
for a ban or moratorium on law enforcement use
of facial recognition. My organization, Open Technology Institute has joined in those
calls for a moratorium. There was a letter, a
coalition letter to Congress this past June from more than 60 groups urging Congress to enact a moratorium on any kind of Federal law enforcement use of facial recognition. There are also, as you heard
referenced this morning in various jurisdictions
across the United States, including locally, San Francisco, and then also in Cambridge, Massachusetts, they have now banned police use of facial recognition technologies. And there are movements to do that as well in other jurisdictions. And then just last week,
just so that you don’t think it’s just the United States, I don’t know that it’s gonna lead to a ban but the Australian
Parliament, interestingly, was considering a proposal for a national facial
recognition database. And they said: Hold on, wait a minute, we need to overhaul this proposal, this raises a lot of concerns about mass surveillance and the need for privacy safeguards, let’s take a closer look at this. And they called for that proposal to be revisited and redrafted. So, again, I don’t want you to take away from this that I don’t think
government should engage in any kind of use of AI. But particularly where we’re talking about programs that really can take away somebody’s individual liberty or affect their access to benefits that can be even life saving, it really is critical
to think very closely about what kind of safeguards
would need to be in place and how those should be carried out. So, now I take questions from here, right? (group clapping) (man speaking) – The question’s kind of more ’cause I’m ignorant about U.S. law. But from the European experience you need a law to allow the mistakes. In my country, Estonia,
we actually have an AI law that restricts this. Is there a legal basis that allows AI to perform these actions? Because I hear only of
laws that restrict AI. But for the use of AI, even by city, state or Federal governments, there’s no law that allows it? – So, no, there isn’t. But, interestingly, these
local laws that I alluded to, the San Francisco ban
on facial recognition, and the same kind of ban in Massachusetts, those are actually part of,
the San Francisco statute is part of a broader movement to require almost exactly that. So, there’s an initiative
called: Community Control Over Police Surveillance or CCOPS that is headed by the ACLU
and there’s a model bill, you can find it online. And it is basically
designed to put in place a statute at the local level that requires communities
and their governments to approve any type of police or in some cases it’s broader,
any kind of government agency use of a surveillance technology. Because right now in
the U.S. in particular, post 9-11, a lot of money became available for all sorts of technologies. And in many communities the
police department would go out and say: Okay, free money, I can get these great
surveillance tools, stick ’em up. And the city councils didn’t even necessarily
know this was happening. And so, this is really a
movement to try and say we need to make sure our
community considers this, we have a public hearing,
we expressly authorize it, put in place a law that says: You cannot use any of these
surveillance technologies unless you get express permission from your local government. And the San Francisco
ban on facial recognition is actually a part of a broader law that sets up that kind of system to put those kind of
requirements in place. – It’s still negative as opposed to a law that defines what can be done. – There’s more components to it. It requires safeguards, it
requires a data use policy and data retention. So, I’m oversimplifying for time but yes, there’s more
requirements than those. – Would you extend it to
the private individuals who use Ring doorbell
or self-driving cars, collecting license plates or? – So, again, focusing on government use, my concerns with the Ring is more what the police have access to. So, there’s been a lot of reporting on how a lot of local police departments have signed up in the
program with the Ring to get access to that footage. And there really aren’t any kind of legal safeguards in place. And so, we have a lot of concerns about whether that is being used as kind of an end run around
Fourth Amendment requirements because the police themselves
aren’t recording the footage, they’re just getting access to it and it’s very unclear what
kind of rules they’re applying. But when law enforcement
gets access to that, to those Ring doorbell cameras, yes, that raises very similar concerns. – So, it sounds like you’re saying no to these. But not necessarily, say, if
there were stronger validations like benchmarking or some analog to that. And say, a better understanding of the strengths and limitations, even thinking about the human services and say, Medicaid entitlements. With those things in place do you think it would, is that the point at which we reconsider these technologies? Or do you just feel that in the next five years
they don’t have them? – So, I would distinguish,
this is very case specific. In the context of facial recognition right now we are not in a place because of the concerns
for perpetuating bias, the concerns for inaccuracy and the extreme power of that technology. Right now we are not in a place where we could say, okay, with this list however long it would be
of safeguards in place, we’d be good to go. We’re not there yet. With regard to some of the public benefits I think there are safeguards. So, a classic thing
people talk about is having a human in the loop. David talked about having
a human alongside the loop. If you’re not actually making
the final determination but if you’re using it as
an alert system for fraud, this is a case that needs investigation, but then you have better means to challenge. – So, more like a prosthetic than an actual decision being made? More like additional context
to inform a decision versus– – So, that’s gonna raise far
fewer risks, right. That would raise far fewer risks. So, in the context of facial
recognition in law enforcement that is a place where we are calling for an absolute moratorium right now. In other contexts it really depends, it’s very context specific. – So, you have no ethical objections if facial recognition
software was 100% accurate you wouldn’t feel there
are any ethical concerns about surveilling people 24 hours a day? – Oh, no, no, no, not at all. I hope I didn’t convey that impression. No, that’s one concern. The accuracy is one concern. No, there’s a lot of concerns about bias. There’s a lot of concerns
about perpetual surveillance. For example, if you are
going to use it for tracking then we would say you need a
Fourth Amendment search warrant in order to track such an individual. And that’s just one safeguard. No, there’s a whole host
of concerns, absolutely. – Questions? I’ll jump in there.
– Sure. – So, I guess there was
an interesting exchange and I think actually between
John and Dave this morning and Mori Chisaki really about kind of who, I think this goes back
to the earlier question about the burden of regulating and kind of the timing of it. And the exchange, for those of you who were not in the morning session, went something to the effect of: well, there are some beneficial uses, even, as John said, in the facial recognition context. And so, are we really matching the right regulatory
intervention here of a ban with what we know about
the potential risks. And how do we think about in this space the right timing of when
to actually intervene? When do we know enough that we should call for
a blanket moratorium as opposed to something
that might be more tailored in terms of the prohibition
for particular purposes? (Dan speaking) – Sure, I mean, again, I would say I think it
is context specific. Law enforcement use of
any of these technologies is going to raise a lot higher risks than a benefits determination although a benefits determination honestly can be a matter
of life and death. So, in some cases maybe
that’s at the same level. But I think when you’re looking at a very serious downside risk then we need to err on the side of delaying implementing these
very powerful technologies. And so, that’s where the call for a moratorium on facial recognition in the law enforcement context comes in. I recognize that these
are challenging questions. And I don’t purport to
have all the answers on exactly where you draw
that line in each context. – Is anyone in government actually thinking about how to bring in the public voices to this? ’Cause it sounds like there’s
many different contexts. Each of those contexts has different legal
stakeholders on both sides. One we should be
particularly concerned about is the impact on society. Is there anyone thinking about helping to bring those voices into understanding these contexts, sort of rifling through
them systematically? – So, Brittny’s gonna talk
about the New York Task Force. I’m not aware of any
other task force like that elsewhere in the country. But that’s one great
effort that is going on. So, maybe I’ll just say that
she’s gonna talk about that. Thank you.
(group clapping) – Well, it’s a real pleasure to be here and to be back at Stanford. It’s nice to see Margaret Levi here who was Director at CASBS when I was there for the year where I’ve done a lot of thinking about the things that
I’ll talk about today. So, it’s wonderful to see
her and see Thomas as well. Yeah, if you can take it it
gives a PDF, that’s great. But, okay, that’s fine, it’s fine. I’m not sure what altitude I’m gonna present these findings to you at whether they’re at a very high level or at a very granular
level, you can decide. But what I wanna do is make an
argument about four obstacles to algorithmic government among citizens, that is, obstacles to citizens’ willingness to support the use of algorithms in government decision making. I don’t wanna begin, or have you end, the presentation with the assumption that I think that citizen consent is, in fact, the primary obstacle or the primary mover of government to using AI. In fact, I think most of it can happen without citizens being aware of it. But I think it’s always useful to think about the circumstances under which something comes under high public scrutiny. And then it becomes a basis on which widespread opposition
and widespread support for government actions are formed. So, what I wanna do is present to you some data from multiple countries, mostly from my own country of Canada, on what things underwrite citizen support or citizens’ opposition to government employing algorithms. And I’m looking very much
forward to talking about it. I’ll just put a couple of things on the table to start. One is that there is a very good case for government employing algorithms to systematically make
decisions of various sorts. Obviously the administrative state is becoming more complex not less complex. That’s happening on a lot of dimensions. It’s making more decisions every day. But it’s also, I think, finding it more difficult
to make decisions as the capacity for government
to act in a positive way is more constrained year over year. We rely on bureaucrats,
that is to say humans, to make a massive number of decisions and to implement policies. Humans are highly imperfect, overcome by any number of cognitive biases that will keep them from making decisions in a way that people designing
policies or processes or indeed bureaucracies might
expect them to make them. It’s clear as well that more
procedurally fair processes don’t lead to better decisions. I was telling both David
and Dan on the sideline that I’m always kind of really struck by these graphs that I’ll see
Dan put up in presentations about the complete lack
of a correlation between a seemingly procedurally
fair method for adjudicating some pretty constrained set of decisions, who should get VA benefits or who should get social security, and the absolutely
massive degree of variance in the decisions that are made. You can set up a process with humans that’s meant to be fair
and to be evenhanded and it won’t have any
bearing on the actual quality of the decisions that get made. And that’s a very good case
for automating some decisions. The state, it seems to me,
is not designed to learn. That is to say that it’s not clear that there are very
well designed mechanisms for one bureaucrat to
learn from the decisions and then the outcome of the decisions that are made by another bureaucrat. So, for all these problems, to some degree there’s a case for algorithmic decision
making to address these. Whether there’s a sufficient
basis of citizen support is a whole other matter. Just to put a few more priors on the table, these are, I think, some of the problems that are gonna limit
the use of algorithms. Obviously, there’s from a
policy maker’s perspective and certainly from the
perspective of politicians there’s a concern about supervision, about the capacity to actually understand what decisions are being made. Wrapped up in that is a whole series of principal agent problems. And explainability may
even be a case of those. Then there’s the issue of implementation, of actually wondering how, outside of the exception
of very entrepreneurial, almost bespoke solutions that
seem to be the norm right now how it is that governments see through the widespread implementation
of algorithmic decision making and algorithmic allocations in government. And then there’s this final authority problem
of citizen consent, that underneath everything
government’s doing is a whole series of bargains that government has made with citizens about what’s acceptable
and what’s not acceptable. And there’s a whole bunch of assumptions that are built into that
which aren’t written in law but which I think guide
the mutual understanding between citizens and politicians
about how things will be. I’ll use just one example of this. It’s not written in law that our, kind of our speeding enforcement
mechanism on highways will be a system in which there’ll be a low probability of capture and then a high fine upon capture. But we’ve all kind of agreed
that that’s the way it works. You can drive 20, 30 kilometers
now over the speed limit and you have some chance of
getting caught with some fine. If you drive a lot more than
that you’ve got a bigger chance. And the fine’s a pretty big one if you’re going fast enough but most times you won’t get caught. From a purely performance perspective we’re at a place where
it would not be difficult for governments to be
assigning speeding tickets to every person who
speeds in every instance. And imagine moving to a
model in which you’ve got almost perfect enforcement
and very small fines. The amount of money people
could pay could be the same. It may have a different
effect on behavior. But the nature of the
bargain between citizens and law enforcement in that circumstance is very different. People are being surveilled all the time, their behavior’s always being corrected. It seems to me that that’s
not written in the law that it’s impossible to do that but it is a case of citizens
consenting to one system and they may not consent
to a different one that relied on technology at its core. So, what I wanna do is
I wanna just highlight what I think are four
challenges to citizen consent for algorithmic government. One is that citizen support varies across justifications for the use of algorithms, but it’s hard to find a full set of justifications that are strongly supported. And I’ll give you a little bit of evidence on that quickly. The second is that citizens evaluate, in the little experiments
I’m going to present here nearly any algorithmic
innovation negatively versus the status quo. And I’ll give examples of
some very reasonable, I think, algorithmic innovations
and then you’ll see opposition. Third, citizens don’t
develop trust in algorithms based on algorithmic performance. In fact the standards they use to judge performance of algorithms appear to be much different
than the standards they use to judge humans. And finally, we’ve got
fairly good evidence that the degree to which citizens oppose algorithmic government is correlated with their larger fears about the effects of
automation and AI in society and the effects that it’s
going to have in the future. And from the perspective of public opinion I’ll tell you why I
think that’s problematic. Okay, let’s get through
this stuff quickly. The data I’m presenting
to you come from Canada, the U.S. and Australia. This slide comes from Canadian
data collected this spring. We explained to people what algorithms were in a survey and told ’em there’s a number of reasons government may use algorithms, gave them all these reasons and asked ’em whether they support or oppose these being the reasons the government might use algorithms. There’s two stories you
could take away from this. One is that, for the most part, the majority of people support
any one of these reasons if you take them on their own. And these are all pretty good reasons to employ algorithmic decision making. But it’s only something
like a quarter of citizens who support all reasons and 20% of citizens only
support half the reasons. The bundle of reasons that you might give for why government should use algorithms depending on that bundle will
have wide or narrow support. These are all very good
justifications for using algorithms. But none of them are supported
universally among all people. And I want you to think about what kind of challenge that creates for policy makers who
are trying to justify why they’re using
algorithmic decision making. So, that’s barrier one. Barrier two is the following: status quo methods
is that status quo methods are preferred to algorithmic innovation. So, without getting into
too much of the details let me tell you what we did. We presented people with
an existing policy area which has some combination of an algorithm, it might just be a scoring system, and a human who’s involved in it. We explained it to them. We did it with three areas: Immigration decisions, tax audit decisions and decisions about the allocation of small business loans by government. And all of these decisions
in the Canadian case have a human as the final decision maker. And we, in each case,
describe these things and then describe innovations
that might be used to cut out the human at
the end of the process. So, I’ll give you an example. With immigration, for example,
we explained to people that the way immigration works in Canada is that people are, they
apply and they’re scored. And this has always been the case for our economic selection of immigrants or for a long time, thanks, Dan. And then we said, you know, there’s other ways you could do this as opposed to having a bureaucrat
make the final decision. You could have a lottery among those who have a high enough score which is, in fact, an
algorithm, of course. You could just choose those
who have the highest score irrespective of any
decision by a bureaucrat. And you could have some
further interviewing and have an algorithm
analyze those things. These are all effectively
just proposed innovations. There’s dozens we could have. The story is that every one
of those is supported less than not imposing an algorithmic improvement on the process. It’s the same thing if we ask people about algorithmic innovations to small business loans. Every one of those
innovations that we propose is supported less than the status quo. And we talked to them about tax auditing. Again, it’s the same thing. Every innovation is supported
less than the status quo. You might see a pattern here about how people respond to the idea that a computer or an algorithm may take the place of a human
decision maker at the end. The third thing is that
reputation building is difficult for algorithms. So, there’s been a fair
amount of work on this. We just did a little example in studies that we did in
Canada, the U.S. and Australia. And the basic idea was
telling people about a hospital in which the decision to
assign surgeons to patients is made either by an
administrator or an algorithm. It’s hard to imagine a less sexy title you could give a human
than an administrator, but I guess it gives people maybe some faith in their competence. But what we did is we randomized people to get an administrator or an algorithm. And the basic story is that there’s a surgeon operating on a person, this is a very obvious
variant of a trolley problem. And that surgeon has a choice to either, there’s a decision made
whether that surgeon will continue to operate on one person or operate on five who have just come in, and there’s a utilitarian logic there and there’s a deontological logic. And depending on your moral tastes you’ll support one or the other. But we randomized the
decision that’s made. The outcome is determined,
conditional upon that. And then we ask people
whether they support the decision made by the
administrator or the algorithm. The basic story, and if you’re a sniper you can read this at the back, but the basic story is that irrespective of the nature of the decision, when the decision is made by the algorithm people are less supportive
of the decision. They’re in less agreement with it. Irrespective of the decision made, right? So, controlling for the
content of the decision. It’s the same thing in three countries. When we ask people then: Would
you support that algorithm or the administrator making
the decision in the future? Again, irrespective of
the nature of the decision whether it was utilitarian or not people are less trusting of the algorithm. And here’s the most
important takeaway finding, is that even conditional upon people agreeing with the algorithm they’re less likely to
trust it in the future to make a decision that
they would trust in. Which is to say that people
aren’t evaluating algorithms the way they evaluate humans, right, on actual performance and/or on some inference about judgment. The final thing I’ll just say is this, is that when we ask people
about their apprehension about the effects of automation and AI on job security and on prosperity, when we ask them if they have a belief that automation and AI is going to lead to more social inequality,
to less social mobility those things are correlated
with one another. And all three of those beliefs at least about job loss,
about increased inequality, about hampered social mobility are all correlated with
a greater opposition to algorithmic government. That matters if you think that the types of political entrepreneurs who will marshal support
against algorithmic government may wanna tie it in with
other broader trends that people are uncertain about. So, these are the top
lines from the findings. And I’m happy to talk about them more. And thanks, Dan, very
much for the invitation. (group clapping) – Hi, that was great. Thank you for all that data. And I’d love to see the charts later. – Sure, sure. – Not a sniper. But I did have a question
in thinking about other algorithms and models that could be used, like autopilot. I was just on a plane yesterday, something that’s now totally accepted, everyone has it. – Yeah. If you’re not on that 737. – Is it a time factor that we expect that as this becomes more, it’s kind of like the evil that I know is better than the evil that I don’t know. Is it something where we
anticipate that over time as some of these wins are more publicized and they can communicate in better ways, we all learn as technologists
how to communicate do you expect or have you
seen other models where over time this kinda survey
might change significantly? – So, that’s a good question. So, it seems to me there
are a couple of things that are going on. I mean, one is that I think
from the perspective of citizens if you sit down and you
talk to a normal person about algorithmic government I mean part of their
question is simply gonna be: Can I get my license plate
renewed online, right, or can I get a health card online and I’m lucky to have health
care, it’s in America. Can I get some government
service online, right? And that’s digital government which is really a distinct thing. And I think that the appetite for that is getting higher and higher. So, that service experience
gets better and better and people are more supportive of it. I’m just not sure that
a generalized support for the use of algorithms
in government is possible. It’s not something that
people, I think, understand that you would have all departments trying to employ different
forms of machine learning and artificial intelligence
to make better decisions. What I can imagine is very
strong pockets of opposition to different types of
government employing of AI in ways that are uncorrelated with people’s views and other things. So, I’ll give you an example. I mean, you can imagine people in a very high salience environment, or high salience area like
taxation or tax policy being just highly
opposed to the government basically fully automating taxes and then just assigning tax bills by algorithm, which is entirely possible. Irrespective of whatever
views they may have on whether the government
should be using algorithms to make small business loans allocations or to do college admissions or whatever. So, in terms of, that’s
all a long way of saying that it seems to me that
the opposition to it can be very acute and very uneven. While everything is progressing you can imagine there being
very strong opposition to small numbers of things
that are highly visible, right? And those could be the ones which really affect the bottom line of
government or not, right? But I don’t think that kind of a broad social consensus around algorithms is necessarily in the offing. – You’ve got a number of questions from this side of the room. (Dan speaking) – So, this follows right on
what you just said, Peter. So, I’m wondering how much this opposition at least as you’ve measured it matters. So, how does it change
into behavioral dissent or behavioral consent? Or how does it formulate into opposition that actually has political salience? And you think about
the case of the autopilot. – Yes. – I mean, we didn’t even,
you know, that just happened. We didn’t get asked about it and we can’t do anything about
it except not get on a plane. And I can imagine all kinds
of government operations beginning to develop, and then we either learn to live with them or there is a movement of some kind. But I’m not seeing how this particular kind of measured opposition
translates into action. – Yeah, I mean, I’ll give
you a particular example in a particular area but then
a logic for why it matters, why it matters more broadly. So, in our own immigration
system the way it works is that most people
are economic immigrants selected based on ability. And upon being accepted you can then engage in family reunification
like you can here, where you’re bringing in your family. There are many more applicants for
family reunifications than there are spaces. So, the way that the
applications get analyzed is just by order in the queue. And order in the queue
is determined by how much you’re willing to pay an
immigration consultant and how good the local consulate is in the town that your
family’s applying from. Government of Canada said, this is crazy, we should just have a lottery which is the most fair way
of dealing with this, right, if you’ve got a list of people
who are qualified to come that is longer than the spaces, you should just randomize
over top of them. The opposition was enormous to this. It seemed unfair to people that this decision was being made effectively algorithmically rather than being made by a human who was deciding which ones or by some kind of effortful
thing that humans were doing. Completely upended family
reunification policy and queues for a couple of years, right? So, you can get it in
high salience policies where people respond negatively. But I think the level
at which it can impair, let’s say, political courage by politicians is that, depending on your system, they’ll think
about what types of things are likely to lay landmines for them. They’ll watch colleagues
go through these things and then not touch with a 10 foot pole the things that they think can make them go through
the awful experiences that a colleague has gone through of having one acute area
being highly scrutinized. Those areas are probably ones
that have some combination of high exposure to citizens. So, lots of citizens are
interacting with them, high salience to those citizens. And a more or less well,
okay, functioning system that’s making them work
at that point in time that doesn’t necessarily
need a complete overhaul. But if you look at, I mean, I guess the final thing I would say is that the evidence for why, the evidence that we’re
not really able to adopt it is things like the VA. There’s no good reason
why so many decisions would be made so slowly
and with such little effect if you could make them in an automated way except that there must be some other kind of opposition to them. And there may be other things there but I think that the potential for highly salient political opposition that basically backlogs and slows down the progress of a whole administration or a whole department for
years is a big matter. – We will still have time
for general discussion. So, let’s batch the last two questions. – I’ve never convinced Margaret with an answer on the first go, so. (group laughing) (man speaking) – I wonder if you can speak a bit about what you think that this
opposition really is. Is this, I’m talking
about if this is totally against innovation or algorithms per se, an informed fear of these on the number of (woman speaking) now than in previous
decades for obvious reasons about the. (woman speaking) Or perhaps an enduring
trust in the sort of like pipes of decision making
that humans can make along the lines that
you guys talked about? What is this opposition really about? (man speaking) – That’s actually a great
question, this is good. I mean, I wonder the
degree to which people simply don’t understand
the scale of government and are under the impression
that the government with people is the best form of government which is to say you should always be able to talk to a person to
help you with your problems. I think that’s actually at
the core of a lot of it. So, that doesn’t necessarily suggest a hostility to technology but it suggests an
attachment to an older notion of how you ought to interact
with your government. – I think we’re gonna, save questions for the general discussion. (Dan speaking) – Great. Thanks, very much. Yeah, thank you.
(group clapping) – All right, good afternoon, everyone. Thanks so much for joining
me and Dan and David, thanks so much for having me today. So, my name’s Brittny Saunders. I work for the New York City
Commission on Human Rights. My presentation is, I think, really designed to really bring us down to a very specific local
government example, an example of a particular
local government agency that is working to try to
address some of these issues and grapple with precisely
some of the same questions that my co-panelists have
raised this afternoon. And I have a lot here
just to kind of give you a little bit of context around what the City Commission
on Human Rights is. I’m gonna speed through it a little bit so that we can kind of dig into some of the questions that were raised. But the agency has a
really interesting history. So, it was founded in 1944
after an uprising in Harlem which is a neighborhood
in Northern Manhattan on the part of black
residents who were concerned about conditions of
overcrowding, of police violence, of housing discrimination which were common
experiences for black people in the Northern U.S. at the time. And so, this first response on the part of New York City government was to create something
called the Committee on Unity. It evolved over time
into what is currently the City Commission on Human Rights. So, we are a local government agency, just like, I suppose, the
fire department or any other except, obviously, with
a very different mission. So, our mission is really
to enforce the city’s very broad and very robust
anti-discrimination protections. So, we have a local human rights law that protects against
discrimination in housing, in employment and in
public accommodations. We also have protections
for something called: Bias-based profiling by aw enforcement. Which was put into place in 2013 in response to a pattern
of discriminatory policing. And then also, something called:
Discriminatory harassment which is almost like a civil
version of a hate crime. I’m not gonna spend too much time on this. I mean, I think it might be of interest but I won’t actually read through each box but it’s just a little note
on the commission’s structure. So, we have a community relations bureau that is staffed by folks who do workshops and trainings with members of the public that are really designed to educate folks about their rights and their obligations under the city human rights law. We have a law enforcement bureau that’s staffed by attorneys who will take in and investigate
complaints from individuals who’ve experienced discrimination under the city human rights law. We have an office of mediation and then an office of the
chair which is where I work that does a lot of
inter-agency partnerships and also promulgates rules and guidance and those sorts of things. Just to make the point about
the expansiveness of our law, as I mentioned, the major
areas of jurisdiction are: Employment, housing and
public accommodations. In many ways the protections that we have kind of mirror what you
will see in other state and Federal anti-discrimination laws, so, we have protections on the basis of: Age, race, disability, gender, which we define to include
gender identity and expression and many other different
bases or categories. But we also have protections
that you don’t see in a lot of other statutes. So, for example, in the employment
space we have protections on the basis of arrest
or conviction history. We have protections in the housing space for individuals who might use a voucher or some other lawful source
of income to pay for housing. And so on and so forth. I’m not gonna spend too much time on this. It’s just to say that our kind of relationship
to interpreting our law as it relates to emerging technologies has kind of grown over time. In around 2015 folks in the commission, this is actually before I
came to the commission myself but started to notice
that they were seeing just through news reports and research more and more use of algorithms
to aid in decision making related to areas of our jurisdiction. So, taking, for example, things
like tenant selection tools that landlords or property
managers might be using to help them identify who
might be a good client or the more common example
of candidate selection tools that employers might be
using to help them understand who they may wanna invite
back for an interview. And for us, of course, this
raised questions about whether the folks who were designing
and implementing these tools were doing this in a way that was mindful of the kind of extensive protections that the human rights law offers. And so, one of the areas
where I mentioned before, actually, I think I just
gave those two examples. So, I’m gonna speed
through just for speed. So, thinking about the employment example and tenant selection. So, that’s kind of how the commission started to engage around these issues was through looking at kind of both, largely, I think, private
use of these technologies and trying to think about what is it that we can be doing to
help folks understand not just that the human rights law exists but how they might go about avoiding violations of the human rights law when designing and
implementing these systems. At the same time there was
a lot of interest developing amongst advocates and subsequently
within the city council around government use of what
ultimately they came to call: Automated decision systems. It’s, I should note here that the term, automated decision systems or ADS is really, really broadly defined in the local law in question
which is Local Law 49 of 2018. So, it’s really very broad, encompasses artificial intelligence but also encompasses a range of other innovations
that people might use that will be much less sophisticated. But the place where the
council eventually landed was to create a task force, basically requiring the
mayor to pull together a group of folks who
would do some thinking around recommendations for how the city should approach the use of
automated decision systems. And I am one of the
co-chairs of that taskforce along with the head of the
mayor’s office of operations and the head of the mayor’s
office of data analytics. And we have been convening
since spring of 2018. And I’ll give you a
little bit of insight now into how we have approached that work. So, the mandate before the task force was pretty significant. And this is per the local law
that I spoke about before. So, the task force is charged with developing recommendations
around a procedure for how impacted individuals
can request information on decisions involving
automated decision systems, a procedure for determining
whether a system is having a
disproportionate impact on the basis of certain
protected categories, procedure for addressing instances of harm from systems that have a
disproportionate impact or are determined to have
a disproportionate impact. A feasibility analysis of
archiving agency systems and associated data. A process for publicly
disclosing information about agency systems and criteria for identifying which systems should be subject to one
or more of the above. So, as you can see, any
one of these things, right, is quite a complicated
and challenging endeavor. And it has been our task to try to address a set of recommendations around all these things. So, how did we go about doing that? So, the first step, of course, was to convene the task force. We pulled together a set of folks who have basically three
kind of buckets of expertise. So, we have folks who are experts on computer science and data science. We have folks who are advocates around a range of different issues and in support of a range
of different communities basically based on the recognition that the knowledge and the experience that those folks have developed advocating around those issues
is gonna be relevant here. And then we also have a number of folks who are actually staff at
different city agencies because, I think in my opinion, that’s also a type of expertise
that’s really important when you’re trying to
develop recommendations for how to go about kind
of governing these systems. So, a little bit of insight into which agencies are on the task force. So, we have the New York City
Department of Transportation, Department of Education,
Department of Social Services, the Mayor’s Office of Criminal Justice, the New York City Police Department and the Administration
for Children’s Services. And so, what this reflects is both a diversity of agencies
with different mandates. New York City has something
like 50 different departments but these represented, I think, some of the largest and
most influential departments and then also some of the areas, frankly, where there has been
a lot of public debate or concern about the use of
automated decision systems or algorithms or artificial intelligence, whatever the case may be. And, yes, so, that’s the membership. And someone had asked the
right question earlier about public engagement and I’ll say that I think as
soon as legislation was passed we recognized that there
was going to be both a need and an opportunity to
engage residents of New York beyond the folks who kind
of think about these issues on a day to day basis
because they are researchers or because they are government officials or because they’re advocates in the space to engage folks in the
conversations about this. We went through this a
couple different ways. So, one was that we did two
large public engagement sessions in the spring of this year where we had folks with a
range of different expertise come and basically offer insights and recommendations to
the task force members. And we also had
significant periods of time set aside for public engagement as well. So, for questions from
members of the public. In addition to that we also
did a set of community sessions where we went out and
connected with young people or residents of different
parts of the city to get their perspectives
and their questions and their concerns around
these questions as well. I am now running out of time. So, I will just go to my last slide which is just to say
that we are gonna produce a report with that set of
recommendations in late November. So, you should look out for that. But maybe I’ll stop
here and take questions. And then we can transition
into the broader conversation. (group clapping) – I have a question. I know there were earlier
issues with the task force not being given access to the data or the insights that they needed. Are there updates around that? – Sure, so, for folks who
didn’t follow this or don’t know there were some task force members who felt that they needed to have basically access to current city systems in order to develop recommendations around how the city should think
about these systems. And we did end up having
some city agencies come and present on systems. For example we had the
Department of Education and the Department of
Transportation come and present to the task force in order to provide some additional insight. I think it was helpful. I think it helped to strengthen
the process, so, yeah. – Thank you for making me
homesick for New York. When you make your recommendations, what do you hope will be the next step? – I think that’s a great question ’cause I think it’s a
really important point that this is not the
end of the conversation in New York or anywhere else, right, like, this is an initial
kind of starting step, a set of recommendations that I think we anticipate folks
will hopefully build on. It’s hard for me to say, in large part, because it’s not up to
me what happens next. But you can imagine some of the pathways that usually end up emerging. – Who does the report go to? – It goes to the mayor. And then it goes to the
city council as well. So, they’ll each consider
what steps they wanna take based on the recommendations. And then I’m guessing there will be a lively public dialog around
the report itself as well. And so, I’m sure it’ll be more, more to come out of that as well. – There were two large sessions
that involved the public and then there were other community– – Yeah. – Over what time period
was that and how many -ish? – Sure. So, that was between April and September. And then we did the two large sessions and about half a dozen other
sessions across the city. – So, it’s a very small,
representative sample of a city of eight million plus. – Yeah, you know, you
have to start somewhere. But it’s true. And I think one of things that we’re certainly thinking about is how to engage a broader set of New Yorkers in these questions. I think it’s not simple. One of the things that I
have observed in this process is that for a lot of reasons, right, not the least of which being the kind of current Federal context in which a lot of the communities that we would want to engage are under attack in a lot of ways. It, I think, can make it
challenging to try to get folks to necessarily invest
really limited capacity under really difficult
circumstances in this. But that doesn’t mean that
there isn’t room for more kind of innovation and
thinking and collaboration and going straight to people
and all sorts of other things to kind of get over that hump. I think, did you have a hand? – Yeah, I was gonna ask, have there been conversations
at the local level around potentially updating
the human rights laws that you’re using to ground
the commission’s work to delineate more specifically around sort of some of these
technological advancements, the way that it’s happening
sort of in the global level around if we start naming
every potential right or every potential violation of a right then are we softening
sort of the body of law? I’m wondering if that’s
happening at the local level. – That’s an interesting question. I mean, I think we feel like the law as it’s currently written
certainly is pretty powerful and could be used for a
lot of different purposes. And in some ways I like that flexibility because the more you enumerate the more you raise questions about the things that
you’re not enumerating. But, but yeah, I mean, there is some of that
conversation happening. – I guess I’ll jump in with a question. It sounds like you’ve gone
through quite the process here in terms of the way we
think about the membership and community engagement. And I guess if you were called up by a sister jurisdiction that wanted advice since this is kind of
the first of its kind about how to regulate in the
space and how to create a process like this for a municipality to really think about the right way to protect equity, fairness and accountability kind of principles based on what you’ve gone through. What wisdom would you have to impart on your sister jurisdiction? – I mean, I think the first
thing I would say is, like, it’s going to be hard. It’s really challenging especially from a government perspective because you really have to, I think sometimes what people don’t necessarily appreciate is that when you’re working in government particularly in local government you’re nested in all
these different regimes that kind of govern how you work, right, so, you have whatever Federal laws might be applicable, state laws, local laws and
they’re not always well aligned, there’s not always a ton of effort
systems relate to each other and you’re operating under
resource constraints. But all that is to say that it will be challenging
for all those reasons. It’s challenging because as government you have a special
responsibility to kind of hold the values around equity and transparency and accountability. But those concerns are kind of
distributed across the board. So, like, there are, obviously, the very serious equity concerns that arise from the sorts of issues that Sharon talked about, right? So, you have data that
is kind of capturing decades of discrimination and then you’re using that
data to train a system and that has the potential
to replicate all sorts of, replicate or amplify all
sorts of disparities. But you also as government
if you are in charge of getting people critically
important services, right, you really do have a
responsibility to think about how you can do that more effectively and more efficiently and
that in and of itself has equity implications
depending on the services that you’re talking about. So, it’s incredibly
challenging as it’s, obviously, maybe not entirely surprising
but helps to set expectations. Two, I think it’s really hard to overstate the role of trust in this. So, like, the communities that are impacted by these technologies and that we wanna engage
in these conversations also often are communities that have had really difficult relationships
with government over time. And so, you know, and then, of course, there’s a current context issue. And then, of course, the examples of, some of the most alarming examples, right, of how these technologies
can be used in the wrong way. It all kind of compounds that kind of lack of trust and the need to kind of build through that and work through that is something that I think
it’s hard to overstate. And I think, gosh, I think, I often think about myself as
a really optimistic person. I’m sounding really down about it now. But I am actually optimistic
about the potential to kind of thread the
needle and get this right. But I do think as challenging as it’s been for us through this process I think we did take the right tack by bringing together a group of people with kind of the three types of expertise that I spoke about earlier ’cause I think it is absolutely essential if you wanna put together
a set of recommendations that are actually going to be something that could be implemented. – I’m curious in terms of, like, more of the capacity to say,
look at health services. For example, like, I could
imagine that being difficult ’cause there’s so many private people and private companies in the space and they could be using their
own risk stratification tools and say we have quality
measures for facilities but they don’t necessarily break that down by the individual characteristics. So, wondering if you
have anything going on in New York City around more population, health management, and measurement? And if you could maybe speak
to some of the challenges in that particular area. – That is a really, really great question. I will say that I would not be surprised if the Department of Health
itself were thinking about this but I don’t have a ton
of insight into that, unfortunately, so I can’t speak to it. Yeah, they’re not on the task force but that doesn’t
necessarily mean that they aren’t doing a lot of
great thinking around this. And actually they tend to
do a lot of great thinking, so, I would look into it. (group clapping) – To kind of react. Do you see a particularly
promising pathway for accountability, for
instance, in this space? – I can start maybe
and see where it leads, see if it generates some conversation. So, I guess, you know, one theme coming out of this panel so far has been there is just a basic problem of sort of multiplicity. So, there’s this, like, there’s the lure of a one size
fits all solution here. And there was even a question, I guess, during my presentation that said: Hey, what is the answer
across the board here? And the reality is that there is gonna be a ton of domain specificity in this area. And so, you know, I think that the logics of government use of
AI at the Federal level are gonna be different from their use at the sub-Federal level. They’re gonna be different across the civil and the criminal divide. Within particular policy silos
they’re gonna be different. And then I think Peter added to that notion of that sort of
overall notion of multiplicity by pointing out that, like, the support and the justification for
these tools is also, like, really quite spread across a bunch of different possibilities. And there’s a basic
cycling problem throughout. So, I guess, I wish that as a community thinking about these
sorts of issues we could work our way down into more domain specific conversations more often. I do think that as we’re kind of, you know, muddling through here and trying to come up with a solution I do think that these conversations often get pitched at
a level of abstraction that’s not as helpful as it could be. And so, just to close
off then and say, like, take the two examples, the
two ground level case studies that I presented during my presentation. Those two areas vary tremendously in terms of the amount of transparency we could warrant over the tools. So, in the enforcement context transparency actually defeats
the tool to some extent. Whereas in the disability benefits context we probably want lots of transparency. In the enforcement context, maybe what we want is some kind of a system level accounting
of how the tool works. Maybe we think that reaches or achieves a desired level of political
and legal accountability. But when it comes to an
individual benefits determination maybe we want decision level accountability. We want an individual beneficiary to know what the provenance
of the particular decision is. Maybe we even want open sourcing in the context of social welfare benefits. And, again, the point is that the logics and the imperatives of
all these different areas where AI’s being deployed
within the public sector are just gonna vary. And so, that’s not very heartening because it means that there isn’t some sort of easy across the board one size fits all solution. Rather what is left for us to do is a lot of hard domain specific work. (man speaking) – The thing I would add to it is that if you tried to back out from this, two or three rules of thumb that a policy maker or a top
level bureaucrat could use to evaluate whether some
employment of an algorithm was good or not in some broad sense, it’s not even clear we
can do that in some sense. I mean, around non-discrimination
it is very intuitive. That’s an easy one for
lawmakers to understand. It’s a very good one as a rule of thumb. What’s the rule of thumb after that? It’s not transparency because
it’s not going to work in every instance, it’s not. It’s probably not ease of explainability because that’s gonna
defeat half the purpose of using machine learning
in some sense, right, to be able to understand
the ingredients of things. So, it seems to me that
a real challenge is that for people who are trying to build up broad political support for things and bureaucratic support for things, not being able to give them a series of easy rules by
which they would evaluate whether something is good or not makes the challenge
all the more difficult. Not impossible, but it
just makes it difficult for them to understand the scope and kind of evaluate the scope and the quality of the things that are being used in
there from their department. – I agree but it’s very context specific. And we can’t, one size fits all. But with your thing about explainability and not wanting to defeat that I wanna push back against
that a little bit. Or maybe you just mean in
terms of who gets flagged for analysis and the trigger. Certainly the underlying, are
you gonna be accused of fraud is something that we
very much need to have people be able to challenge and explain, and why did I get, you know, why did this bad consequence happen. They need to be able to test that and adjudicate that which is why I mentioned earlier I was very intrigued by your up front, can we have notice and comment rulemaking on algorithms before we even implement them to test that ahead of time because I think one issue that comes up in some of these contexts is the people who are making the rules or employing these are lawyers like me or other people who left
math behind a long time ago. And there’s a notion, I
think, especially sometimes when judges are looking at these things, well, math must be right
and I don’t understand math and a little bit of reluctance to test it. And, of course, the
formulas must get it right. And sometimes that’s gonna be true and sometimes it’s not. And we need to at least
in the certain cases be able to test these. You have your sample
maybe that they could test and make sure it’s running accurately even if you can’t dig
underneath in every case. Having that kind of due
process is not just important because it’s a legal
requirement in so many contexts but just to give people the confidence and the faith of what is this
magic math that’s happening, is it really working properly. – Based on, you know,
on some of the agencies that you were looking at you were using machine
learning technology. I mean, how’d they define
how they get value out of it? I mean, I think that’s
something that companies are struggling with right now, like most AI projects are stalling. So, I would just be even
curious whether the SEC is even having conversations with other agencies and, like, look, we deployed this and we’re getting x, x, x, x. I mean, how are they defining value? – That’s a great question and it’s not, I don’t have, like, a definitive answer as to any particular agency. We’ve had a lot of conversations with a lot of different
agencies who’ve done this. So, I think in a time of
really deep fiscal constraint I think the real value that, my guess is what’s driving
a lot of the innovation that we’re seeing within government is an ability to do more with less. And so, it’s the efficiency rationale. Maybe in some of the
adjudicatory agencies, so, maybe at the Social
Security Administration if we had them here we could ask them. And maybe Dan could speak to this as well ’cause he’s interacted
with a lot of these folks within say, the SSA developing
these adjudicatory tools. You know, maybe they would
say that decision quality actually is what’s driving us. So, it’s less about back logs. It’s more about narrowing some of those inter-judge disparities
that I talked about. But I have to think that
in the enforcement agencies like at the SEC, I think, I don’t know if this is right or not, but the principal lure
or allure of these tools is that you can do more with less because these are budget strapped agencies and they have big regulatory mandates. They have more stuff
than they can possibly do with the resources that they have. And they wanna try to
maximize what they’re up to. – Yeah, for a number of Federal agencies there are these performance
measures that are set up under the Government
Performance and Results Act. And so, for an agency like the Social Security Administration they have performance metrics like the number of cases you’ve decided, the number of times your
cases have been reversed. And so, you’re really grounding yourself by achieving those performance metrics. And sometimes that’s good, when you’re directly able to
measure what we care about in terms of promoting social welfare. And other times, if you’re solely sort of maximizing your case load that can drive exactly
the kind of dispersion that sort of irked Peter in his response. And so, in these particular use cases it was about being able
to flag errors internally earlier through an
automated decision system. Then, you know, through
what is probably not a very sophisticated evaluation they claimed that, hey, through this tool we were both able to improve consistency, lower errors and speed
up case processing times. But one of the kind of worries is that if you don’t have something like transparent evaluation method it’s really hard to know
how much credibility to put into those kinds of
internally generated reports. And that’s part of the kind
of political dynamic too in terms of the adoption
of these kinds of use cases which is for a lot of government agencies it’s really high risk to use a number of
internal staff resources to develop something that you can’t say was a winner all around. And I think that really affects
where you end up investing. And at an agency like the
Securities and Exchange Commission one of the interesting
things to observe was that they actually did have more of the kind of culture of saying, we’re gonna seed six experiments, we’re gonna expect more of them to fail. That’s the right level of risk for us to actually make sure
that we’re building out in the right areas. – And just a quick rider
of what Dan just said which is, you know, the
political incentives that agencies are facing
are very different across agencies, across policy
silos and what have you. Some agencies are claiming
they’re using machine learning but when you put Stanford
engineers on the case they’re not. And so, in some cases there’s
incentives for puffery. We’re on the cutting edge, we’re at the frontier of
technological development of governance tools. But at other agencies they are using it but, man, they wanna
stay off the political radar as much as they possibly can. So, it’s been really
interesting as we’ve fanned out across the entire Federal
administrative state to see some of those differences. It hasn’t always arisen in
places you would expect. You might expect, for
those of you who know a little something
about administrative law you might expect that
the executive agencies would be most politically sensitive, those are the ones who are under the direct control of the president whereas the independent agencies, the independent commissions
might be less so. But I don’t think that we’ve
seen any regularity there in that regard in terms of
the political sensitivities. – And the Stanford engineers
out of the 170 or so use cases rated the degree of sophistication
of Federal use cases. And I believe that about
eight percent of them were rated high in sophistication. So, it gives you a little bit of a sense of the capacity gap
between the private sector and the public sector. – Can you release those? I would love to see
the breakdown of, like, what is the most sophisticated
agency versus the least. – Yeah, those will be
part of a large appendix in a report.
(group laughing) You have more questions over here. – So, I’m hearing several
different issues being debated. One is sort of that
the level of what Chuck was talking about earlier
about what principles to use as opposed to the domain specificity or the technological specificity. And Peter began to have
a response to that. But I’d love to see that
explored a little more ’cause I can think of other principles than discrimination and
non-discrimination or transparency. But the thing that I’m also struck by is that the another underlying theme that I hear from most of you has to do with creating
confidence in these tools both within the agency, within the public and among political decision makers. And so, I’m wondering why there isn’t, I mean, I probably can answer
this question as I ask it but it seems to me that
this is a kind of area where a different kind of
strategy might be tried. And I’m thinking here the
example of Code for America or even some of the
behavioral insight units where you start with a problem which people really are
being inconvenienced by or they’re being denied benefits and you solve the
problem with an algorithm, or some algorithmic part of the solution. And you give people and the
public and the decision makers some confidence that this can work in those kinds of contexts and settings and then you can build up from that. So, I’m not hearing strategies really. I’m hearing, like, this happens
here and this happens there. Are there governments
where there are or groups, I mean, other than things
like Code for America where these kinds of strategic
efforts are going on? – Yeah, I mean, I’ll take
the first crack at that. I think the work that Code for
America is doing is fabulous. I think one of the real challenges when you’re talking about
Federal regulatory agencies is that you have to serve
the public at large. And so, the principal
output by Code for America is usually an app. And that raises all sorts of
questions about the kind of digital divide across
different sub-populations. So, I think there has
been at least some caution about adopting something that really may have
disparate impact in some ways if it’s only people with smartphones who can kind of access a
more expedited mechanism of actually getting benefits. Brittny, I’d actually love to hear from you. I don’t wanna cold call you. But I think Margaret’s first
question was so interesting, right, because we had Marietje Schaake saying we have 128 frameworks for ethics in AI in Europe alone. And how do we take that
broad set of principles and reduce it down to
something that’s more tractable than an open ended call
for equity and fairness. And that’s so much of the
work of this commission that you’re up to. And I know you haven’t
released the report yet but it must be a struggle
that you’re facing in terms of not having the
lowest common denominator consensus position being adopted. How do you move this really forward in a kind of material way? – So, you’re right. So, we developed a set of recommendations that were going to be kind of applied to New York City government
across the board, right? So, that’s one of the challenges. I think, to your point about
how we tried to make sense of what’s kind of already out there. Like, we, as I mentioned, we had these public engagement sessions. Folks came and shared their insights, submitted written materials, some folks just submitted
written materials. We kind of pulled that all together and have been using that
to inform where we land. I definitely can’t tell you
where we’re gonna land, again. – We’re on the edge of our seats. – Yes, yes, exactly. It’ll be coming soon. But I think the answer is, really, probably unsurprisingly, iteration. So, you know, this will be
where we land in this phase. And then I am hoping
that we’ll be able to, we, ourselves, the City of New York will be able to build upon what this set of recommendations is. And other folks will take it
and do with it what they will. And maybe they’ll arrive at some of that kind of domain specificity and
all sorts of other questions and do that process as well. – And do the rest of the
panelists have a response to Margaret’s second question of: Do we just need a different strategy here, something along the
lines of Code for America or a Digital Brigade or
the U.S. Digital Service, 18F, right, different ways of actually fostering this kind of innovation from within the department? – I would say three
things very, very quickly. One is that I think public
administrative structure matters and understanding what the incentives are within different forms of government is gonna matter quite a lot actually. That’s the first point. The second one is that, you know, if we think about the metrics by which we wanted to classify things, there’s all sorts of ones where citizens have no
interaction with government. And that might be the
place to really break out solutions as aggressively as possible. That works against the strategy of building citizen confidence because citizens aren’t
interested in the control functions in the background
of government or something, but those are good places to test ’cause there’s a lot of stuff the government does over
and over and over again. But the third point
which goes a little bit to the speeding example that I gave and certainly goes to tax audits is that we shouldn’t think that citizens want government to work perfectly. Citizens are actually pretty happy that there’s a lot of stuff government doesn’t do well, right? Like, I think a lot of people
when they file their taxes are pretty glad government
doesn’t do it perfectly on the audit function. They’re certainly glad the
government doesn’t catch, you know, subway fare cheats perfectly or speeders perfectly. So, you know, there’s a degree to which I think we’ve gotta be very, in thinking about the strategy of it, being pretty shrewd about the areas where citizens want improvement in a domain and to choose those ones. – I agree. – And not to–
– Not just efficiency. – That’s right. To not presume that
efficiency’s actually one of the principal objective functions that citizens are trying to
maximize on in every instance. People don’t want there
to be perfect enforcement in a lot of things. – I’m just gonna pick up on the notion of how you
make the decisions, and the earlier question here about
getting the public involved and how at least in your task force although I haven’t heard
of any other jurisdictions stepping up to try and
do something like that to bring in the public. But I did recall and I checked. My former agency was mentioned. I worked at the Privacy and
Civil Liberties Oversight Board, a tiny agency that reviews Federal
counterterrorism programs to make sure that they have privacy and civil liberties safeguards built in. One of the projects
they are doing actually is looking at the use
of facial recognition and other biometrics with regard to flying beyond the no fly list
but in that context. It’s not exactly a public, it’s not the same level
of bringing in the public but there is an opportunity
to hear from the public. And so, that is one context
in which there is that review. But I wonder to what extent, you know, thinking of your question
and your New York model, there can be these. Because I think getting the
public’s input is so important whether it’s the, whether
it’s in public rule making or thinking that through and
how government can do that because also, I think, with the challenges from my time in government
of government procurement and what you’re doing when
you’re trying to think about building an algorithm in this new space I’m sure that that is part
of the problem as well. – Can I say one quick thing? It’s a great question,
Margaret, about strategy. And you immediately went to
what NGOs or what civil society can do to sort of help
shepherd this process forward. (woman speaking)
In responsible ways. But there’s also a certain
amount of strategic action at the agency level. And that’s been interesting
to watch as well within our project as we’ve
tried to push it forward. So, the SEC, I think,
has been quite strategic. And that’s precisely
because they have some entity level
infrastructure for doing this. So, I think they’ve been quite reflective about which tools they develop and when. And so, they’re, you know, gradually selling their
regulated community on the use of these tools. Social Security Administration
which doesn’t have quite as much, as far as we can tell, entity level infrastructure, has also been quite strategic. If you think about which tools
they’ve actually implemented, they are, like, well chosen. So, the Insight system is
an error correction system for drafts of opinions. And that’s not likely to
get anyone’s back up, I think. A tool that I didn’t tell you
about is a tool that tries to, it’s a machine learning tool
that identifies easy grants. That is to say, cases that the agency is surely going to grant eventually. And the idea is to identify them early and then push them to staff
level decision makers. And that then frees up time
for the administrative judges to focus on the harder cases. That’s also a great strategic move. If you’re gonna start to
develop a machine learning tool it seems like a win for everyone. It’s also very hard to challenge legally. It’s unclear how you would bring, who first of all would
bring a challenge to that. (woman speaking) It’s possible someone who
doesn’t get the benefit of the easy grant process could challenge. It’s not entirely clear
they can get legal standing, we’d have to think about
it a lot as lawyers to understand that. And so, the result is that the agencies are moving strategically in ways– – So, could I encourage
you two, with Tino et al., to write up the strategic part of it? – Yeah, sure. – I think that’s really important
for agencies to understand that there are strategies here. – But what you worry about then is: Are they really strategic? Which means, you know, our
government tech overlords are figuring out ways to
sell us on these tools as they work up momentum
towards more tools that are much more consequential and that maybe aren’t
quite as well designed and that actually create
problems, I don’t know. This is why I think we’re so wedded to this idea, an internal capacity. There’s this long tradition
in administrative law that says that actually sort
of internal administration is where a lot of the due process happens. It’s not imposed from outside by courts; there’s just too much going on within any administrative system for courts to really police it in any kind of
thoroughgoing or rigorous way. And so what you have to depend on is reflective practice
by the administrators who wield this really
important set of public powers. And so, the hope is that
not only are they thinking kind of instrumentally, strategically about how to bring these things to kind of public market,
to use by government, but that they’re also thinking about the moral dimensions of what they’re doing and are asking often enough the question that Jeremy
Weinstein asked, though he’s now gone, about whether there are
conversations within these agencies about whether these are appropriate
uses in the first place. – Question in the back. – Are there any tools which
really kind of benchmark or quantify when an AI tool can be introduced in a human ecosystem, the ratio of human versus AI interaction? Is there a way to quantify it? For example, you use technology (woman speaking) when a technology’s obsolete or when a technology’s
completely invisible (woman speaking) like a microwave or electricity. We don’t feel that it’s there. Is there any benchmark (woman speaking) for when it is the right time, the right ratio of AI versus human? Like a tool, do you have it? Are you building anything about it? – Are you offering to build it for us? (group laughing) I’m not aware of such a tool. But I think this ties in really nicely, I think, to kind of Margaret’s
question about strategy because we do know that
there are a whole bunch of preconditions that have to be there in order to even contemplate this and only then do you get to the question of like the right mix
of how much to automate and how much to leave to human discretion. So, it’s agencies that
actually have laid the backbone infrastructure to actually
capture a bunch of data. So, that’s the example of Gerald Ray of getting to the agency
at a point in time where everybody’s just using WordPerfect or Microsoft Word to write down decisions that aren’t capturing any
structured information coming out of any of these decisions. And he starts to actually build
out that data infrastructure to even get to this point
of having a decision tree for how you decide a Social
Security Disability case. You have to have some folks on staff. And I think part of what explains the 50% of agencies where there’s not really much public evidence of any experimentation is that there is a pretty substantial gap in terms of who
agencies are able to hire. And so, there are attempts now, and this goes back to kind of the question of how to think about this strategically, where we’ve now had
questions about moving people over to an excepted service or adding the parenthetical “data scientist” to a bunch of existing job classifications to make the hiring of
those people easier, right? So, there are a whole bunch
of these pre-conditions that I think have to be there for an agency even to get to the point. Now, depending on
which agency you’re in maybe Sharon would think
that’s a good thing that those pre-conditions
aren’t there for some of them. But I don’t know if other staff feel– – I might just make this observation that when you see these estimates about what percentage of
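The kind of decision-tree structure described above, built once an agency captures structured case data instead of free-form Word documents, might look something like the following sketch. This is purely illustrative: the steps, field names, and dispositions are invented for this example and are not SSA's actual criteria.

```python
# Purely illustrative sketch of a rule-based screening sequence of the kind
# discussed above. The steps and field names below are invented for this
# example; they are not SSA's actual disability criteria.

def evaluate(case):
    """Walk a hypothetical screening sequence and return a disposition."""
    if case["working_above_threshold"]:
        return "deny: substantial gainful activity"
    if not case["severe_impairment"]:
        return "deny: no severe impairment"
    if case["meets_listing"]:
        return "grant: meets listed impairment"
    return "refer: needs further individualized assessment"

case = {"working_above_threshold": False,
        "severe_impairment": True,
        "meets_listing": True}
print(evaluate(case))  # grant: meets listed impairment
```

The payoff of capturing decisions as structured fields rather than prose is that each branch above can then be counted, measured, and audited.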
jobs can be automated. One way that these get backed
out is that, for example, and I’ve done this in surveys where you ask people across
18 or 20 competencies, do you do the following things, right? And then we know at what levels
technologies are performing at current levels of demonstration. But, for example, if you
take the McKinsey filter which has got 18 questions not a single one of those questions is: Do you make moral judgements in your job? They ask questions about
reading and navigation and inference and giving instruction but there’s no questions
about moral judgment, right? It seems to me to be one of the key ones about whether or not
things could be automated and put into an algorithm or not. – I mean, fundamentally, what
makes AI in this space so much more complicated
is that there’s not necessarily core agreement as to what the single
objective function is of a government agency because you’re talking about something as diffuse as social welfare as opposed to just, you
know, revenue and profits. And so, that’s what makes these decisions much harder. – So, if we go back to
pre-industrial times (woman speaking) then what is the next point in our future? Before the production systems came, before the Henry Ford
production lines came, it was a very different sight. To go back there, stand
there in the beautiful past and see how technology’s progressed, then what will be the next point, and where does AI fall in? – I think in a sense actually a lot of the panel’s discussion has been about trying
to get an understanding of where that frontier is
as a descriptive matter. And we could kind of try
to extrapolate from that. But I’m not sure if any
of the other co-panelists have a good forecast here as to– – I’m not sharing it. (group laughing) – Thank you. – Last question. Yes. – It struck me that you just said that the complicated thing
about AI in government is that the optimization
function is more complicated. And I wonder if to what
extent that is true. It seems to me that a
lot of the conversations that are going on here
at Stanford and beyond about optimization strategies in AI are precisely around the
question of, like, you know, how to resolve trade-offs between efficiency and equality, and all sorts of other questions that are generating the
conversations that we’re having now about the impact of the use on society. And so, I wonder, thinking about a regulatory framework: to what extent do you expect the regulatory framework
for government use to be substantively or
structurally different from a regulatory framework for industry? – So, I think it’s a
really profound question. I don’t know if others wanna chime in here. I’ll say one quick thing, which is, just in the adjudication setting, the very sort of rule that tells you whether you’re eligible
for disability benefits is completely humanly created. It depends on who sits
in the appeals court. And they may be wrong. And so, it’s a kind of complex system where you don’t have an objective referent at which you can know whether or not this decision is accurate, right? And so, I think that makes it just a much more thorny
measurement problem. Sharon. – I was just gonna say, as I said at the outset
of my little talk, the consequences when you’re
dealing with government are vastly different than when you’re dealing
with commercial uses. So, the government can put you in jail. The government controls your
entitlement to benefits. We have notions in our law of due process, meaningful notice and a right to be heard that apply to the government
that don’t apply in the, I mean, we would like to get due process, I’m sure from private
companies in many contexts. But it’s a very different framework and what’s at stake is very different. Don’t get me wrong, I’m also a strong advocate for
consumer privacy legislation. I hope we will get that
here in this country. And AI is one component of that. And we do need to have safeguards for what companies can do with our data, how they handle it, how
long they can retain it, what kind of
secondary uses or sharing they can do. There’s a whole lot of issues. But I think
they’re different issues, and in government it just raises a whole ‘nother level of concerns that for me in many ways are more scary. – But there are
clear analogs, right? So, if you were thinking, we’ve talked a bunch about fairness. And that has been a really
challenging literature to try to understand, right? The ideal is, we write down a
formal definition of fairness. We choose that. We treat everything as a
constrained optimization problem and then we’ve met that
one definition of fairness. Well, what we’ve seen
over the past few years has been the explosion of 27 different definitions of fairness. We have the impossibility theorems, they’re mutually incompatible. And so, then you’re ultimately forced to make these kinds of normative decisions even in the private sector context if you’re trying to ensure that whatever private
AI tool you’re adopting doesn’t have a disparate impact. With that I think we’re out of time. So, let me– (group clapping)
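The tension described just above, among multiple formal definitions of fairness, can be made concrete with a toy example. All numbers here are made up for illustration: when base rates differ across groups, a predictor that satisfies demographic parity (equal selection rates) generally violates equal opportunity (equal true-positive rates), so choosing between them is a normative decision, not a technical one.

```python
# Editor's toy illustration of the tension discussed above. All data are
# invented. With different base rates across groups, one predictor cannot
# generally satisfy demographic parity and equal opportunity at once.

def selection_rate(preds):
    return sum(preds) / len(preds)

def true_positive_rate(labels, preds):
    hits = [p for l, p in zip(labels, preds) if l == 1]
    return sum(hits) / len(hits)

labels_a = [1, 1, 1, 0]   # group A: 75% base rate of the true outcome
labels_b = [1, 0, 0, 0]   # group B: 25% base rate
preds_a  = [1, 1, 0, 0]   # predictor selects 50% of group A
preds_b  = [1, 0, 0, 1]   # predictor selects 50% of group B

dp_gap  = abs(selection_rate(preds_a) - selection_rate(preds_b))
tpr_gap = abs(true_positive_rate(labels_a, preds_a)
              - true_positive_rate(labels_b, preds_b))

print(dp_gap)            # 0.0 -> demographic parity is satisfied
print(round(tpr_gap, 2)) # 0.33 -> equal opportunity is violated
```

Driving `tpr_gap` to zero here would require unequal selection rates, which is exactly the kind of trade-off the impossibility results formalize.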
