- [Anna] Well, good afternoon, everyone. I'm really delighted to be here, and honored to have the opportunity to present this research to you in person. I'm joined remotely by my colleague, Professor Rob Jago from Royal Holloway, University of London, who I hope is with us. I think he is.

- [Prof. Jago] I am here, Anna. Thank you.

- [Anna] Great. So Rob is going to be talking in a little bit about some of the qualitative findings from our work together.

The first thing to say is that I'm not a technical person. So those of you who are computer scientists in the room are going to be disappointed, because I'm going to talk only very superficially about the technical side of this work. I have a clinical background. I spent many months in the company of people with really impressive technical skills, some of whom are listed on this slide. I'm sorry you can't see their names clearly, because they have been absolutely critical to the success of this project. I also want to thank Maryann in particular from NCSBN, and the Center for Regulatory Excellence, for taking a leap of faith back in 2018, when AI was hardly in the public discourse, and certainly not in relation to regulation, and for funding this project.

What this study was seeking to do was explore the uses of a particular type of artificial intelligence in nurse regulation across three jurisdictions: here in the U.S., the UK, and Australia. It was the first study of its kind in the world, as far as we're aware. We were very fortunate to work with the Texas Board of Nursing who, again, were instrumental in making this happen. I again want to acknowledge the contribution of Kathy Thomas and Mark Majek, who I know have both retired now, as well as Dusty Johnson, Skylar Caddell, and Tony Diggs. And I think Elise McDermott is here from the Texas Board of Nursing.
So will you please convey my heartfelt thanks to those people, because they've been fantastic collaborators on this project.

When you think about artificial intelligence, you probably think about a whole range of different technologies. Just yesterday, we heard about the robocall, the one using Joe Biden's voice ahead of the New Hampshire elections: effectively, somebody trying to spread false information using the president's voice. That's the kind of thing that grabs the headlines. That's the kind of technology that people are fearful of, and certainly don't want to see anywhere near their work environments. Contrast that with the incredible successes of these kinds of technologies in, for example, improving the speed and accuracy of cancer diagnosis. If you talk to some of the radiologists who are at the cutting edge of using AI in their clinical environments, they are incredibly excited about it. So we've got a whole spectrum of interpretations, and constructs, and views about this technology in between those two extremes.

And I suppose one of the big challenges we face, whether in research, clinical, or indeed commercial settings, is whether we can understand AI as a force for good, whether it should be regulated, how it should be regulated, and where it will have the most impact. These are the kinds of questions that are front of mind when people think about these technologies. But one thing is absolutely certain: this technology will not go back in its box. It's here to stay.

So in the context of regulation, our challenge was to explore how a particular type of artificial intelligence could be used.
Because Robert and I had been working in the area of nurse discipline, looking at the way complaints about nurses are handled, we were particularly excited about the prospect that these tools could help with the speed, accuracy, and consistency of decision making in nurse discipline. What we know from research around the world is that a very, very small number of nurses are a high risk to patients, and that around 70-plus percent of cases or complaints made to regulators around the world require no regulatory action. So Robert and I were particularly interested in exploring how these tools could be used with that particular cohort, where there was no harm to patients and low risk in terms of regulatory assessment.

The other really important part of the thinking behind this work was that we were made aware of just how scarce a resource disciplinary staff in regulation are. There's often a high turnover of regulatory staff who work in disciplinary teams, because it's very high stress for relatively little reward. And the number of complaints is increasing everywhere. So you've got this rise in the number of complaints, you've got a high turnover of regulatory staff, and you've got a scarce resource that you want to look after. Showing whether these tools could assist with that was essentially what we were aiming to do.

So our research question was, can this be done? Our aim was to focus first on whether or not an AI tool could actually calculate the risk level at the early stage of a complaint. So when the summary data comes in, when the complaint comes in, can a tool assess the data and come up with a risk classification in the way that we as humans do when we do this work? Secondly, we wanted to test whether the tool could actually link cases to regulatory rules and standards.
And thirdly, could it find previous similar cases, so that the case manager deciding whether to take a particular case forward could look at previous cases, see what the outcomes of those decisions had been, and add that to their human judgment on the case in front of them?

In terms of our methodology, and as I said, I'm not going to go into great technical detail on this, we were fortunate to be able to access 3,000 cases from Texas, and we had about the same proportion, 1,200 or so, from the UK and from Australia. We used that data to build the tool using a Python web development framework. What this tool actually did was try a number of different AI classifiers. We used five different AI classifiers. For those of you in the audience who are from that background, we used gradient boosting, adaptive boosting, a CNN, and an ensemble model, which was a combination of three of the five. We fed in the complaint text, so all the information that we had about that complaint, including the source of the complaint, the risk level, and the harm to patients, if there was any. The classifiers would then come up with a risk rating, or risk classification, for each case. That was the first task, if you like.

Then the second. We know from research and debate about AI that data quality is a big issue. What you put into your machine learning tool, your training tool, determines the quality of information that you're going to get out of it. So we were keen to check, in particular, for human bias on race, gender, and age. At this early stage, we didn't have enough data on race and age, so we focused just on testing whether the tool could effectively de-bias for gender. We used three different gender de-biasing techniques, removing gender, neutralizing gender, and swapping gender, so that we could make sure the risk level that was calculated wasn't biased on gender.
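[Editor's note: to make the classification step a little more concrete, here is a minimal sketch of how complaint text might be fed to several classifiers and combined into an ensemble, in the spirit of what Anna describes. It uses scikit-learn's gradient boosting and AdaBoost with a simple soft-voting ensemble; the study's CNN, exact feature set, and ensemble composition are not reproduced, and the file name, column names, and labels are illustrative assumptions, not the project's actual code.]

```python
# Illustrative sketch only: risk classification of complaint text with an
# ensemble of classifiers, loosely following the approach described in the talk.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier, AdaBoostClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.metrics import classification_report

cases = pd.read_csv("complaints.csv")   # hypothetical de-identified case summaries
X = cases["complaint_text"]             # assumed column: free-text complaint summary
y = cases["risk_level"]                 # assumed column: e.g. "low", "medium", "high"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Three classifiers combined by soft voting; predict_proba supplies the kind of
# confidence score shown alongside each risk rating in the dashboard.
ensemble = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=20000, ngram_range=(1, 2))),
    ("vote", VotingClassifier(
        estimators=[
            ("gb", GradientBoostingClassifier()),
            ("ada", AdaBoostClassifier()),
            ("lr", LogisticRegression(max_iter=1000)),
        ],
        voting="soft",
    )),
])

ensemble.fit(X_train, y_train)
print(classification_report(y_test, ensemble.predict(X_test)))

# Confidence scores for a single new complaint, as a case manager might see them.
proba = ensemble.predict_proba(["Medication error reported, no patient harm."])[0]
print(dict(zip(ensemble.named_steps["vote"].classes_, proba.round(3))))
```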
The final part was the qualitative testing, which Robert is going to talk about shortly. We were very keen, all the way through this project, to involve regulatory staff, the people on the ground who were doing this work, testing out what we were developing, involving them in the design, and getting their feedback.

What I'm going to do now is show you some screenshots of the prototype that we've developed. For obvious reasons, I'm not using Texas Board of Nursing data, which wouldn't be appropriate. This is showing accessible data from a U.S. financial regulator, in fact. As you can see, the first page is a secure login. What comes up when you go into the login page is, first of all, the summary. So you see there in the center the cases. I'm so sorry this is so small, but I know you're going to get these slides afterwards, so you'll be able to look at it in a bit more detail. If I use the pointer, this is just the headline from each case, and you can see the case numbers going down. What you have going to the left of the slide is, first, the probability score, the confidence score. That tells you something about how much confidence you can have that the risk level is actually accurate. This one says 98%. And then the final column, which is blank at the moment, gives you the space to enter the human judgment.

So right from the start, in the design of the dashboard, what you have is a tool that leaves the final decision to the human. It provides information, but it doesn't make a decision. That's left to the case manager. What you see in the bottom right corner is just the graphic for high, medium, and low risk. Again, that will be a calculation on each case.
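[Editor's note: as a rough illustration of how a dashboard like this might expose the model's output while leaving the decision to the human, here is a minimal sketch using Flask. The talk only says a Python web development framework was used, so Flask, the route, and every field name here are assumptions rather than the project's actual code.]

```python
# Minimal sketch of a dashboard endpoint that returns the model's risk rating and
# confidence but deliberately leaves the human judgment blank. All names are assumed.
from flask import Flask, jsonify

app = Flask(__name__)

# In a real prototype these would come from the case store and the trained ensemble.
CASES = [
    {"case_id": "C-001", "headline": "Medication error, no harm reported"},
    {"case_id": "C-002", "headline": "Practising outside scope of licence"},
]

def classify(case):
    """Placeholder for the ensemble's prediction; returns (risk_level, confidence)."""
    return "low", 0.98

@app.route("/cases")
def list_cases():
    rows = []
    for case in CASES:
        risk, confidence = classify(case)
        rows.append({
            **case,
            "model_risk": risk,        # the tool's suggested classification
            "confidence": confidence,  # e.g. 0.98, shown as 98% on the dashboard
            "human_judgment": None,    # left blank: the case manager decides
        })
    return jsonify({"cases": rows})

if __name__ == "__main__":
    app.run(debug=True)
```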
This next slide is perhaps a little bit clearer for those of you at the front of the room. On my right, you see the actual text, the free text, if you like. What you see highlighted are the key words that the tool has used to calculate the risk, which gives you a sense of which key words were of particular importance in arriving at that risk score. And then above, as you see, you've got your risk score, your probability, your confidence score, at the top of the page. That's the risk calculation that the case manager can then use in forming their own judgment about the case.

What we also developed, very much in collaboration with regulatory staff who said, "These are things that we would find useful," was, first of all, a tool that could surface the relevant section of the regulatory rules. As we learned from working with Texas, there are pages and pages and pages of rules. So the tool effectively extracts the elements, the section of the rulebook, relevant to that particular case. That's the second task, if you like, beyond the risk calculation. And the third is to allow the case manager to compare the case they're looking at with any previous cases where a similar pattern of noncompliance has been found. So it's all about triangulation of data, and not in any sense about replacing human judgment.

We've been very encouraged by the first phase of testing, the reliability testing, where, as I said, we had these five different AI classifiers. We compared them to a baseline and found that, in terms of their reliability, the scores were looking promising. Not as high as we need them to be, but this is only on a sample of 1,241 cases, the first cases that we took through the phase one testing.
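[Editor's note: the similar-case lookup Anna describes can be illustrated with a small semantic-similarity sketch. The published work mentions semantic similarity measurement; the TF-IDF cosine approach below is just one plausible way to do that, and the past cases and outcomes are invented for illustration, not study data.]

```python
# Illustrative sketch: retrieve previous cases most similar to a new complaint
# using TF-IDF cosine similarity. All case texts and outcomes are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

past_cases = [
    {"case_id": "P-101", "text": "Nurse administered wrong dose, self-reported, no harm.", "outcome": "no action"},
    {"case_id": "P-102", "text": "Repeated failure to document controlled drug administration.", "outcome": "conditions on practice"},
    {"case_id": "P-103", "text": "Worked while impaired, patient harm alleged.", "outcome": "suspension"},
]

new_complaint = "Medication error with incorrect dose, reported immediately, patient unharmed."

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform([c["text"] for c in past_cases] + [new_complaint])

# The last row is the new complaint; compare it against all past cases.
scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

for case, score in sorted(zip(past_cases, scores), key=lambda pair: -pair[1]):
    print(f"{case['case_id']}  similarity={score:.2f}  outcome={case['outcome']}")
```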
And now I'm going to hand over to Robert, I hope, to talk to you about the qualitative findings.

- [Prof. Jago] Okay. Thank you very much, Anna. Good afternoon, everyone, and thanks to the conference organizers for allowing me to appear remotely.

When we look at the general ethical concerns in the AI space, we found ourselves drawn to the seminal work of Michael Sandel. You'll see on the slide that Sandel identified three key ethical concerns. The first of these is concern about privacy and surveillance. The second is around bias and discrimination. And the final ethical concern raised by Sandel relates to the role of human judgment, and the worry about replacing human judgment.

Now, in our work, certainly in the area of health regulation, we were also drawn specifically to the work of Gabrielle Wolf. This has been critical, because that work explores the ethical implications in the context of Australian health practitioner regulation. Here the first concern is equality before the law. Then you have transparency and accountability. The next concern is consistency and predictability. Then Wolf explores, again, the right to privacy, which of course had been considered by Sandel. And finally, there's reference to the use of AI and how it could potentially undermine the right to work.

Could you move to the next slide, please, Anna?

So in our research, as well as developing and testing the prototype, we also, as Anna suggested, conducted focus groups with colleagues from our three sites. We explored with them what they saw as the benefits and burdens of using AI, and you'll see on the slide the three main themes that were raised.

The first of these is negotiating trust and trustworthiness. Our participants focused much attention on trust, mistrust, and trustworthiness in relation to the inclusion of an AI tool in decision making related to complaints in nurse regulation. Our focus group participants were very aware of the consequences of error in fitness to practice processes.
As they indicated, flawed decision making could result in registrants either continuing to harm patients or, of course, being incorrectly judged to lack fitness to practice. Either way, the consequences are serious, and they remind regulators of their responsibilities to ensure that decision-making processes are trustworthy. Within this theme, our participants talked about prioritizing honesty and transparency. They also said it was important to think about the language being used, and how impactful it could be in dealing with issues around AI.

The second of our themes was affirming fairness and nondiscrimination. There was a real concern within this theme that any outputs from AI tools could incorporate bias and result in unfair and discriminatory decision making. There was, therefore, a focus on trying to minimize bias. There was a need to avoid fabrication, and a need to understand which values or elements are prioritized in any decision-making tool or algorithm. Related to this was the final sub-theme, ensuring accountability. Our participants explored the objective versus the subjective nature of decision making in this area, and the importance and challenges of clarity when it comes to accountability. It was recognized that there would probably be some technical developments to assist this process, but that regulators should always remain alert to the potential for discrimination.

The final of our themes was about managing burdens and benefits. There was a strong awareness in our focus groups that regulatory decision making is complex, and that it needs to take into account context, uncertainty, and ambiguity. The sub-themes here spoke to the shades of gray, the need for consistency, but obviously the need to negotiate complexity as well. There were concerns about ensuring that nothing fell through the cracks.
And there was a certain caution, and optimism, regarding what any AI decision support tool could actually deliver in the context of professional regulation. We thought it critical to mention that participants felt there was a real need for humility as to our expectations of AI. Finally, there were some discussions around effectiveness and burden reduction, and the potential benefit of any tool vis-a-vis the emotional content of fitness to practice complaints. Thank you, and I now hand back to Anna for some concluding thoughts.

- [Anna] Thanks, Rob. So, lots in there, and we really have packed in rather a lot. We've published three papers, two in the JNR and one in the "Journal of Computational Linguistics," which even I don't understand, but it's there. So if you're interested, please do read in more depth.

I think what I want to leave you with, I suppose, is just that sense of how we get from data to policy. How do we do that in this AI-rich environment that we're increasingly living with? I think we can't, as regulators, be left behind. We have to embrace this, but we have to do it in a way that reflects our values and our commitment to fairness, to openness, to transparency, and to robust evidence. This study is a very first baby step towards that. And it demonstrates that you can use these multiple techniques, text classification, semantic similarity measurement, and natural language inference, to provide an outcome that regulatory staff told us was of value to them.
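[Editor's note: for readers curious about the natural language inference piece, one common way to link a complaint to candidate rule sections is zero-shot classification built on an NLI model. The sketch below uses the Hugging Face transformers pipeline; the rule descriptions are invented examples, and this is not the study's actual implementation.]

```python
# Illustrative sketch: an NLI-based zero-shot classifier suggesting which
# regulatory rule areas a complaint may relate to. Rule texts are invented.
from transformers import pipeline

classifier = pipeline("zero-shot-classification")  # defaults to an NLI model

complaint = (
    "The nurse failed to document administration of a controlled substance "
    "on two occasions during the night shift."
)

candidate_rules = [
    "accurate and complete documentation of care",
    "safe administration of medications",
    "maintaining professional boundaries",
]

result = classifier(complaint, candidate_labels=candidate_rules, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{score:.2f}  {label}")
```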
A lot of them came into the work with great skepticism about whether this was going to have any benefit for them and their work. In fact, some of them were very honest about the fact that they were fearful it was going to have an impact on them and their livelihoods, their jobs. And over the course of the two years of working with them, I think they could see, and they told us that they could see, the benefits as well as the potential pitfalls.

We've deliberately designed this tool so that it can be used by anybody, by any state board who's interested in replicating this study and wants to continue with us on this journey. We have no copyright on the work, and we're really keen for NCSBN to take it forward. I know from being here today that you have a fantastic team of data scientists who have the capacity, and I hope the interest and the willingness, to take this work forward. It certainly isn't going to be me; it's going to be the data scientists who do this.

But just my very final thought is that transformation comes from people, not from tools. So we need to engage everybody in this debate, because there is a lot of fear and anxiety about the intrusion of these sorts of tools into regulation, as in all areas of our lives. I hope that, along with Robert, I've given you just a glimpse into the potential this tool has to improve regulatory decision making and to bring that consistency. And crucially, to shift that precious human resource to the high-risk cases, where we know there's going to be a full investigation, which is hugely time consuming. Those high-risk cases, where there are patient safety implications, are where we should be investing our human resource. And then we can embrace the new technology to make the screening out of low-risk cases faster, better, and more consistent, without compromising that essential ingredient, which is human judgment.

Thanks very much. Robert and I would be very happy to take any questions, comments, objections, illuminations from you now.

- [Jose] Good afternoon. Sorry, that was a little too loud. Jose Castillo, Florida Board of Nursing. I love the [inaudible] comment.
It's so enlightening that we are seeing AI being used in a very good light in the regulatory world. As we all know... well, I'm also an educator, not that you know that. But as we all know in education, and I should start there, AI is either being ostracized or embraced completely, so there's no middle ground. But I guess my question is, from the regulatory realm, one of the aims is to calculate the risk level using anonymized data. I'm not sure, can you expand on that, like how much of the data? Because we know as regulators there's a plethora of data when it comes to an investigation. So which ones would be highlighted by AI, or has that been looked at? How does that come into play with the overall evaluation, so that it will evaluate the risk?

- [Anna] That's a very good question.

- [Jose] Thank you.

- [Anna] And quite possibly, my slide was a little misleading. The aim in the initial stage, the very first pilot stage, was to see whether we could actually calculate risk using anonymized data. That very first stage was about just seeing whether, through the case summaries, we could come up with a risk score comparable to the human judgment, without knowing anything more about the case. As we went through the project, however, the Texas Board of Nursing staff clearly knew exactly which cases they were giving us. They were the ones who were familiar with the profiles, and the outcomes, and the context for these complaints. So as we moved through the project, we weren't using anonymized data, but we were de-identifying the cases so as to protect people's identities. So I probably slightly misled you on that. The first stage was anonymized data from the financial regulator; that was a database our data scientists could gain access to, and it's freely available.
But the next stage was actually a de-identification of cases, and testing to see whether that risk classification worked. So it was... Yeah, I hope that answers your question.

Okay. Well, thank you so much for your attention. Thank you.