WEBVTT
1
00:00:04.740 --> 00:00:16.619
Maria Kieferova: Hello everyone, thanks for having me. My name is Maria Kieferova, I am a postdoctoral fellow at the University of Technology Sydney and I'm also part of the Sydney Quantum Academy.
2
00:00:17.310 --> 00:00:32.790
Maria Kieferova: And today, I would like to talk about my recent work on quantum machine learning, some of which is already published and some of which is still in preparation, so keep a lookout for our manuscript within the next few weeks.
3
00:00:33.870 --> 00:00:43.920
Maria Kieferova: This work was done in collaboration with Nathan Wiebe, who is at the University of Toronto, and Carlos Ortiz Marrero, who is at Pacific
4
00:00:45.240 --> 00:00:47.730
Maria Kieferova: Northwest National Laboratory.
5
00:00:49.920 --> 00:01:08.370
Maria Kieferova: And I will talk about one of the central problems in quantum machine learning, which is quantum barren plateaus: how we understand them, and also how one can avoid barren plateaus and efficiently train quantum neural networks.
6
00:01:10.710 --> 00:01:30.780
Maria Kieferova: So, to give you a taste of what I will talk about, I will start by explaining why I personally think that quantum machine learning is an exciting area of research, and hopefully I will manage to persuade you that quantum machine learning is really worth
7
00:01:32.760 --> 00:01:33.570
Maria Kieferova: getting into.
8
00:01:35.040 --> 00:01:56.880
Maria Kieferova: In the more technical part of my talk, I will explain two very popular quantum machine learning models and the problem of vanishing gradients that these models experience, and in the last part I will present our proposal for escaping barren plateaus.
9
00:01:58.110 --> 00:02:03.870
Maria Kieferova: So I assume that many of you are already familiar with traditional machine learning.
10
00:02:05.370 --> 00:02:20.100
Maria Kieferova: But for those who are maybe not, I like this definition from Arthur Samuel: machine learning is a field of study that gives computers the ability to learn without being explicitly programmed.
11
00:02:21.300 --> 00:02:30.810
Maria Kieferova: And this definition is not really strict or mathematical, but it allows us to extend it into the quantum realm.
12
00:02:31.440 --> 00:02:41.310
Maria Kieferova: So now, whenever we are interested in quantum machine learning, we will be looking at quantum computers and how they can learn without being explicitly programmed.
13
00:02:42.240 --> 00:02:58.230
Maria Kieferova: This is not the definition that everyone in quantum machine learning uses, but for the type of quantum machine learning I'm interested in, we will only be looking at programs that run on quantum computers, so programs that utilize quantum hardware.
14
00:03:00.570 --> 00:03:06.660
Maria Kieferova: And I know that this is not a conference on quantum machine learning or quantum algorithms.
15
00:03:07.500 --> 00:03:26.520
Maria Kieferova: But there is a very deep, very strong connection between machine learning and statistics, so people who are already experts in statistics will hopefully also find some interest in machine learning and how it's related to quantum models.
16
00:03:28.110 --> 00:03:35.040
Maria Kieferova: So the motivation for quantum machine learning usually comes from just putting quantum and machine learning together.
17
00:03:35.730 --> 00:03:56.820
Maria Kieferova: And there is a pretty good reason for it; both of these approaches are very, very powerful. In quantum computing we have polynomial-time algorithms for problems that don't have efficient classical algorithms; the best-known case of such an algorithm is Shor's algorithm for factoring.
18
00:03:58.020 --> 00:04:10.770
Maria Kieferova: And similarly, machine learning is very useful at solving problems that might be very hard in the worst-case scenario, but in many practical cases, or on average, can be easy.
19
00:04:13.230 --> 00:04:16.110
Maria Kieferova: easy for machine learning models.
20
00:04:17.190 --> 00:04:27.300
Maria Kieferova: So what happens when we put these two powerful techniques together? Well, we might come up with something that would be much stronger than either one.
21
00:04:27.630 --> 00:04:40.350
Maria Kieferova: But it may also be possible that these are two things that shouldn't really be mixed. We all know that pasta is amazing, ice cream is amazing, but ice cream pasta doesn't sound particularly appetizing.
22
00:04:41.460 --> 00:04:51.060
Maria Kieferova: But with quantum machine learning the story is not really clear yet. We have some results from statistical learning theory
23
00:04:51.570 --> 00:05:11.430
Maria Kieferova: saying that with quantum machine learning we are able to learn some models more efficiently in terms of the number of samples that one would need; however, so far there is no evidence that quantum machine learning can provide useful speedups for practical problems.
24
00:05:13.920 --> 00:05:23.670
Maria Kieferova: The way I like to look at quantum machine learning is much more fundamental: it's a fundamentally different type of learning.
25
00:05:24.900 --> 00:05:33.360
Maria Kieferova: One way we can utilize quantum computers is to take some classical data and encode them into a quantum state.
26
00:05:34.050 --> 00:05:50.520
Maria Kieferova: And then use a quantum machine learning algorithm for, say, a generative model or classification. In this scenario, quantum could possibly offer a more powerful class of models for classical data.
27
00:05:51.780 --> 00:06:04.050
Maria Kieferova: However, this path would require state preparation, which is very often a difficult procedure that might kill all of the speedup that the quantum machine learning algorithm would provide.
28
00:06:05.430 --> 00:06:11.970
Maria Kieferova: The more esoteric part of quantum machine learning is when we are learning directly from quantum states.
29
00:06:13.020 --> 00:06:26.490
Maria Kieferova: So you could assume that we have some states that are coming from a quantum experiment, or that are the output of a different quantum algorithm, and then our quantum machine learning algorithm is learning to
30
00:06:27.690 --> 00:06:34.710
Maria Kieferova: generate states that are a priori unknown, or characterize processes and measurements.
31
00:06:37.020 --> 00:06:50.310
Maria Kieferova: The type of quantum machine learning that I'm particularly interested in looks at generative models. In the classical scenario, we would have
32
00:06:50.970 --> 00:07:09.180
Maria Kieferova: several instances of our training data, and our goal in the training would be to learn the underlying distribution the data was originally generated from, and then we would be able to use this distribution to generate new instances.
33
00:07:10.590 --> 00:07:21.240
Maria Kieferova: In the quantum case, we would be given many copies of some unknown state ρ₀, and in the process of training,
34
00:07:21.780 --> 00:07:35.010
Maria Kieferova: our goal is to find some description of the state ρ₀. This description needs to be classical, and then we would be able to use it to generate
35
00:07:35.820 --> 00:07:50.670
Maria Kieferova: new states ρ that would be relatively close to the original state ρ₀. So this quantum generative model can also be seen as a type of approximate cloning.
36
00:07:55.860 --> 00:08:05.130
Maria Kieferova: Being able to learn directly from quantum data is a really big, ambitious goal.
37
00:08:06.570 --> 00:08:17.100
Maria Kieferova: However, it's also really hard, and we don't really have a lot of evidence yet that quantum machine learning can achieve all these speedups.
38
00:08:18.240 --> 00:08:28.800
Maria Kieferova: One of the reasons why there aren't that many results that would demonstrate a strong quantum advantage is that we don't have large quantum computers.
39
00:08:29.910 --> 00:08:39.660
Maria Kieferova: In machine learning, a lot of the evidence is empirical: we have different algorithms, and people run them on large benchmarks in competitions.
40
00:08:40.770 --> 00:08:54.180
Maria Kieferova: And the outcomes of these competitions and benchmarks tell us which are some of the best machine learning algorithms; however, quantum hardware is still in its infancy.
41
00:08:54.810 --> 00:09:07.260
Maria Kieferova: And the problems that it can solve are typically also very easy for classical computers, and it's very hard to extrapolate from such small data.
42
00:09:08.220 --> 00:09:20.910
Maria Kieferova: What we can do instead is use theoretical analysis: we can try to prove theorems, think about mathematical models, or perhaps no-go theorems, that would
43
00:09:22.050 --> 00:09:34.890
Maria Kieferova: tell us which areas we should be looking at and which areas won't be very fruitful in the future, and this is the type of research that I'm interested in.
44
00:09:36.510 --> 00:09:45.150
Maria Kieferova: Similar to machine learning, in quantum machine learning we also have a big trade-off between expressivity and trainability.
45
00:09:46.380 --> 00:09:54.810
Maria Kieferova: So one of the big promises behind quantum machine learning is that we would be able to use quantum models that would be
46
00:09:55.440 --> 00:10:09.360
Maria Kieferova: much more accurate, that would be able to explore a larger space, or that would be able to model correlations that are difficult to model classically.
47
00:10:10.380 --> 00:10:18.900
Maria Kieferova: And we have several architectures that are very expressive; they are as powerful as general quantum computation, or BQP-complete.
48
00:10:20.250 --> 00:10:36.120
Maria Kieferova: However, the challenge with these very expressive models is their trainability: if models are very, very powerful, we don't have techniques that we would be able to use to train them.
49
00:10:37.350 --> 00:10:54.360
Maria Kieferova: And behind this intuition there is also a hard theorem, a result from just this year, that says that under fairly reasonable conditions, models that are very expressive are also exponentially difficult to train.
50
00:10:56.010 --> 00:11:06.090
Maria Kieferova: The training of most quantum machine learning models follows some type of gradient descent, stochastic gradient descent, or
51
00:11:07.080 --> 00:11:18.690
Maria Kieferova: something along those lines: at every step the algorithm would compute a gradient wherever it is in the optimization landscape, and then follow it to go downhill.
52
00:11:19.500 --> 00:11:41.970
Maria Kieferova: However, there are many areas where there is simply no gradient, and our algorithm would get lost because the landscape is flat: there is no direction that would be better than any other direction. If we experience such a phenomenon, we call it a barren plateau.
53
00:11:45.030 --> 00:11:51.030
Maria Kieferova: So, before I get into the more technical part, do we have any questions so far?
54
00:11:55.050 --> 00:12:03.240
Maria Kieferova: also feel free to ask questions during the talk and just interrupt me and I will try to explain the best way I can.
55
00:12:04.380 --> 00:12:04.800
Maria Kieferova: All right.
56
00:12:05.820 --> 00:12:20.100
Maria Kieferova: And so now I will talk about two types of quantum machine learning models, and the first of them is the quantum Boltzmann machine, which takes inspiration from the classical Boltzmann machine.
57
00:12:21.150 --> 00:12:29.160
Maria Kieferova: So a Boltzmann machine is a network that takes its inspiration from physics, specifically from the Ising model.
58
00:12:30.240 --> 00:12:37.680
Maria Kieferova: A Boltzmann machine is defined on a graph with vertices that represent the hidden units or visible units.
59
00:12:39.570 --> 00:13:00.000
Maria Kieferova: For each edge we are given a weight, and each vertex can also have a bias, which allows us to assign an energy to every configuration. So you can imagine a Boltzmann machine as a graph with a spin sitting at each vertex that is either pointing up or down.
60
00:13:01.920 --> 00:13:05.970
Maria Kieferova: And then, for each configuration,
61
00:13:07.080 --> 00:13:17.820
Maria Kieferova: each assignment of the vertices, and for some weights and biases, we can compute the energy, and to each energy we can assign a thermal distribution.
63
00:13:21.630 --> 00:13:32.130
Maria Kieferova: Then the goal of the training would be to learn the biases and the weights such that the thermal distribution will be close to the distribution of the training data.
64
00:13:33.300 --> 00:13:41.910
Maria Kieferova: And we will only be looking at the thermal distribution on the visible units; in this case, the hidden units serve as extra degrees of freedom.
65
00:13:43.320 --> 00:13:50.010
Maria Kieferova: And the way we define closeness for Boltzmann machines
66
00:13:51.240 --> 00:13:53.220
Maria Kieferova: is in terms of the negative log-likelihood.
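To make the energy, the thermal distribution, and the negative log-likelihood concrete, here is a minimal sketch of a fully visible Boltzmann machine on three spins. The weights, biases, and toy dataset are made-up values for illustration, not anything from the talk.

```python
import itertools
import math

# Made-up weights w[i][j] on edges and biases b[i] on vertices of a 3-spin graph.
w = {(0, 1): 0.5, (0, 2): -0.3, (1, 2): 0.8}
b = [0.1, -0.2, 0.0]

def energy(s):
    """Ising-style energy E(s) = -sum_ij w_ij s_i s_j - sum_i b_i s_i, spins in {-1, +1}."""
    return -sum(wij * s[i] * s[j] for (i, j), wij in w.items()) \
           - sum(bi * si for bi, si in zip(b, s))

# Thermal (Boltzmann) distribution p(s) = exp(-E(s)) / Z, at inverse temperature 1.
configs = list(itertools.product([-1, 1], repeat=3))
Z = sum(math.exp(-energy(s)) for s in configs)
p = {s: math.exp(-energy(s)) / Z for s in configs}

# Training loss: negative log-likelihood of a (made-up) dataset under the model.
data = [(1, 1, 1), (1, 1, -1), (1, 1, 1)]
nll = -sum(math.log(p[s]) for s in data) / len(data)
```

Training would adjust `w` and `b` to push `nll` down, bringing the thermal distribution closer to the data distribution.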
67
00:13:55.260 --> 00:14:13.470
Maria Kieferova: A Boltzmann machine has lots of properties that make it very easy to translate into the quantum realm: instead of binary units, we can take qubits; instead of simple connections, we can assign a local Hamiltonian to each pair of qubits or to each qubit.
68
00:14:14.730 --> 00:14:24.030
Maria Kieferova: And then, instead of having just a simple energy function, we can define a Hamiltonian that would be a sum of all the local Hamiltonians.
69
00:14:25.050 --> 00:14:32.460
Maria Kieferova: And lastly, instead of only having probability distributions we can now work with density matrices.
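As a rough illustration of this last step, here is a sketch of the Gibbs density matrix ρ = e^(−H) / Tr[e^(−H)] that replaces the thermal distribution, for two qubits with an invented transverse-field-Ising-style Hamiltonian (the coefficients are assumptions for the example, not from the talk).

```python
import numpy as np

# Pauli matrices and identity.
I = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

# Made-up two-qubit Hamiltonian: a ZZ coupling plus local X terms.
H = -0.5 * np.kron(Z, Z) - 0.3 * np.kron(X, I) - 0.2 * np.kron(I, X)

# Gibbs state rho = exp(-H) / Tr[exp(-H)], via the eigendecomposition of Hermitian H.
evals, evecs = np.linalg.eigh(H)
expH = evecs @ np.diag(np.exp(-evals)) @ evecs.T
rho = expH / np.trace(expH)
```

The resulting `rho` is a valid density matrix (unit trace, Hermitian, positive), playing the role the Boltzmann distribution played classically.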
71
00:14:38.190 --> 00:14:48.930
Maria Kieferova: This extension very much follows the intuition of going from a simple Ising model to a transverse-field Ising model or more complicated spin models.
72
00:14:49.680 --> 00:15:07.800
Maria Kieferova: However, the challenge in quantum Boltzmann machines comes in the form of training, because the classical training techniques for minimizing the negative log-likelihood don't really translate easily, due to some technical issues arising from non-commutativity.
73
00:15:10.800 --> 00:15:20.910
Maria Kieferova: And the second very interesting, very popular quantum machine learning model is an analog of a feedforward neural network.
74
00:15:22.650 --> 00:15:32.160
Maria Kieferova: Unlike Boltzmann machines, feedforward neural networks only serve as a loose inspiration for unitary quantum neural networks,
75
00:15:33.480 --> 00:15:54.750
Maria Kieferova: in the sense that now, instead of having weights and biases, we will work with parameters of quantum gates. So a unitary quantum neural network will simply be a quantum circuit with many parameterized gates, and in this talk I will work with one particular parameterization.
76
00:15:56.850 --> 00:16:12.090
Maria Kieferova: The thing we are now trying to minimize is an objective function defined in terms of the output of the circuit and some operator H that tends to be Hermitian.
77
00:16:13.230 --> 00:16:26.430
Maria Kieferova: We of course have the ability to define this operator on all the qubits, but very often we will only use a subset of qubits, the same way as in many cases we have visible and hidden units.
78
00:16:29.700 --> 00:16:38.190
Maria Kieferova: Visible units will now be the ones where we evaluate the objective function, and every other qubit will correspond to a hidden unit.
79
00:16:38.940 --> 00:16:59.880
Maria Kieferova: So a concrete example of a unitary quantum neural network where we would use hidden units is a classifier, a quantum circuit used for classification: we would only measure one qubit at the end for binary classification, and all the other qubits we can just trace over.
80
00:17:01.050 --> 00:17:02.460
Maria Kieferova: Then to.
81
00:17:03.750 --> 00:17:13.770
Maria Kieferova: to train such a quantum neural network, we will try to learn this parameterization, we will try to learn all of these
82
00:17:14.850 --> 00:17:27.960
Maria Kieferova: angles, such that we are able to minimize the objective function. And this is, again, an idea that is closely related to the variational principle in physics.
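To make this concrete, here is a minimal sketch of such an objective for a toy two-qubit circuit, with its gradient computed via the standard parameter-shift rule for rotation gates. The circuit layout and the angles are illustrative assumptions, not the parameterization used in the talk.

```python
import numpy as np

def ry(theta):
    """Single-qubit Y rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                 [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float)
Z0 = np.kron(np.diag([1.0, -1.0]), np.eye(2))  # Hermitian operator: Z on the visible qubit

def objective(angles):
    """<psi(theta)| Z0 |psi(theta)> for the circuit CNOT . (RY(t0) x RY(t1)) |00>."""
    state = np.zeros(4); state[0] = 1.0
    state = CNOT @ np.kron(ry(angles[0]), ry(angles[1])) @ state
    return state @ Z0 @ state

def parameter_shift_grad(angles, k):
    """Exact gradient for rotation gates: (f(t_k + pi/2) - f(t_k - pi/2)) / 2."""
    plus, minus = angles.copy(), angles.copy()
    plus[k] += np.pi / 2; minus[k] -= np.pi / 2
    return (objective(plus) - objective(minus)) / 2

angles = np.array([0.7, 1.2])
grad0 = parameter_shift_grad(angles, 0)
```

Gradient descent would then update each angle against its parameter-shift gradient, in the spirit of the variational principle.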
83
00:17:29.610 --> 00:17:30.240
Marius Junge: question.
84
00:17:30.630 --> 00:17:34.050
Marius Junge: Yes, so can you go back to your previous picture.
85
00:17:34.650 --> 00:17:35.250
Maria Kieferova: Of course.
86
00:17:35.760 --> 00:17:41.370
Marius Junge: If you learn the unitaries, would you assume that the connections are given and not changed?
87
00:17:42.750 --> 00:17:44.220
Maria Kieferova: um yes.
88
00:17:45.420 --> 00:17:45.750
Maria Kieferova: yeah.
89
00:17:45.870 --> 00:17:48.000
Marius Junge: Okay Okay, so this configuration.
90
00:17:48.480 --> 00:17:48.930
Marius Junge: So the.
91
00:17:49.410 --> 00:17:59.640
Marius Junge: The depth and the circuit are sort of clear, but the unitaries are not, because that can change a lot, right? I mean, if you change these connections, this changes outcomes dramatically, right?
92
00:17:59.910 --> 00:18:09.750
Maria Kieferova: Um, yes, so we start with some circuit where you decide where to place the gates, so you treat that as a hyperparameter, and
93
00:18:10.800 --> 00:18:11.820
Marius Junge: corresponds to.
94
00:18:11.880 --> 00:18:16.680
Marius Junge: changing local interactions, like in these Ising-model-type configurations?
95
00:18:16.920 --> 00:18:25.740
Maria Kieferova: Um, this is not really the same as an Ising model, because now everything is unitary, okay.
96
00:18:26.040 --> 00:18:26.700
Marius Junge: But but.
97
00:18:26.850 --> 00:18:31.110
Marius Junge: You could imagine that you have unitaries for every edge in your picture, right?
98
00:18:31.860 --> 00:18:32.760
Maria Kieferova: Yes, yeah.
99
00:18:32.850 --> 00:18:40.350
Marius Junge: And then the graph determines which connections you have, and then you just change the unitary at every edge.
100
00:18:40.800 --> 00:18:42.030
Maria Kieferova: yeah so yeah you can.
101
00:18:42.030 --> 00:18:42.930
Marius Junge: Believe here.
102
00:18:43.530 --> 00:18:47.790
Maria Kieferova: yeah, so this would be similar to changing the interactions on every edge.
103
00:18:49.110 --> 00:18:49.980
Marius Junge: Okay, good Thank you.
104
00:18:51.150 --> 00:18:52.170
Maria Kieferova: Yes, um.
105
00:18:52.290 --> 00:18:54.270
Maria Kieferova: yeah, of course, you know you can always.
106
00:18:54.270 --> 00:19:05.400
Maria Kieferova: decide that some interactions won't really be there, and we can set one of the angles to zero; that would effectively lead to completely deleting one of the gates.
107
00:19:06.720 --> 00:19:09.810
Maria Kieferova: Putting new gates in would be quite difficult.
108
00:19:11.040 --> 00:19:21.870
Maria Kieferova: All right. Okay, so how would we train quantum Boltzmann machines and feedforward unitary quantum neural networks? As I already said, we will
109
00:19:22.410 --> 00:19:36.960
Maria Kieferova: base our approaches on gradient descent, but a lot of the intuition that I will talk about will also translate to gradient-free methods or techniques utilizing
110
00:19:38.580 --> 00:19:39.150
Maria Kieferova: SEM.
111
00:19:40.410 --> 00:19:52.830
Maria Kieferova: So for gradient descent, we first need to estimate the gradient at every step, and we have to be able to estimate the gradient efficiently.
112
00:19:54.060 --> 00:20:05.610
Maria Kieferova: And the second part that is important to us is convergence: the number of steps we will take to find a good solution must be at most polynomial.
113
00:20:07.230 --> 00:20:21.840
Maria Kieferova: However, there are many results that say that performing such training efficiently is not possible; in other words, there are barren plateaus in many quantum neural networks.
114
00:20:23.070 --> 00:20:30.240
Maria Kieferova: The first such result came from Jarrod McClean and a team at Google, showing that
115
00:20:31.350 --> 00:20:39.270
Maria Kieferova: for quantum circuits, unitary quantum neural networks, that are sufficiently deep,
116
00:20:40.950 --> 00:20:52.440
Maria Kieferova: the variance of the gradients and the size of the gradients will decrease exponentially with the number of qubits.
117
00:20:53.520 --> 00:20:56.040
Maria Kieferova: Another similar result came from
118
00:20:57.570 --> 00:21:07.620
Maria Kieferova: Marco Cerezo and collaborators, who showed that for many circuits that are shallow, but with global cost functions,
119
00:21:09.090 --> 00:21:13.500
Maria Kieferova: the objective functions will be very flat for most
120
00:21:14.610 --> 00:21:20.100
Maria Kieferova: parameters, and we wouldn't be able to learn in many scenarios.
121
00:21:21.810 --> 00:21:29.580
Maria Kieferova: An important thing to point out is that these quantum machine learning algorithms start at the barren plateau.
122
00:21:30.060 --> 00:21:38.520
Maria Kieferova: In traditional machine learning there are also results about vanishing gradients, but these typically
123
00:21:39.300 --> 00:21:56.460
Maria Kieferova: occur much later in the training. In quantum algorithms, when we start learning, we don't know anything: we have a circuit, or a model, that is absolutely terrible for the task we want to achieve, and we are not able to make it any better.
124
00:21:59.460 --> 00:22:10.590
Maria Kieferova: So let me now tell you about my result, which shows how hidden units affect the size of the gradient.
125
00:22:11.640 --> 00:22:21.630
Maria Kieferova: So in both cases, for quantum circuits and for quantum Boltzmann machines, we will have visible units and hidden units, and
126
00:22:23.520 --> 00:22:29.640
Maria Kieferova: typically, we will introduce entanglement between them.
127
00:22:30.720 --> 00:22:53.070
Maria Kieferova: When running the circuits, our qubits will almost always become entangled, and also for most Hamiltonians there will be some entanglement between the two subsystems. Now the question is: how will the entanglement affect the size of our gradients?
128
00:22:54.480 --> 00:23:04.410
Maria Kieferova: What we showed in our work is that if we assume that the circuit implements a roughly random unitary,
129
00:23:05.670 --> 00:23:10.740
Maria Kieferova: then the visible and hidden units will be very close to maximally entangled.
130
00:23:12.030 --> 00:23:31.980
Maria Kieferova: However, we can't really implement random unitaries, because random quantum states and random unitaries cannot be produced efficiently; we would need an extremely deep, exponentially deep quantum circuit to really sample from the unitary group at random.
131
00:23:33.060 --> 00:23:45.900
Maria Kieferova: Instead, we will only assume that our circuit approximates t-designs, which means that the mean and variance for the t-designs will be the same as for true randomness.
132
00:23:46.950 --> 00:24:00.180
Maria Kieferova: But quantum t-designs can be prepared with circuits of polynomial depth. This means that with quantum t-designs our result is much more technical,
133
00:24:01.230 --> 00:24:06.420
Maria Kieferova: but it is an assumption that holds true for randomly initialized
134
00:24:09.720 --> 00:24:13.050
Maria Kieferova: quantum circuits of reasonable depth.
135
00:24:15.180 --> 00:24:24.780
Maria Kieferova: So let me explain the intuition with a very simple example: we will have a larger system of qubits and we will only measure one subsystem.
136
00:24:27.630 --> 00:24:31.980
Maria Kieferova: In the simplest scenario, our full system will
137
00:24:33.030 --> 00:24:42.750
Maria Kieferova: consist of two subsystems, each of which has only one qubit, and let's say that
138
00:24:43.200 --> 00:24:55.200
Maria Kieferova: the state of the entire system is maximally entangled between the V subsystem and the H subsystem. So now, if we only look at
139
00:24:55.770 --> 00:25:09.450
Maria Kieferova: the V subsystem, it will look like a maximally mixed state. This is because all of the information stored in our system was non-local, because of the entanglement.
140
00:25:10.740 --> 00:25:30.570
Maria Kieferova: So even if we start from a pure state, if the state is very entangled, by measuring only part of the system we won't be able to learn anything about the entire system; the subsystem looks completely random and doesn't give us any information about what is
141
00:25:31.740 --> 00:25:49.290
Maria Kieferova: happening in the system at large. And it turns out that this very simple intuition coming from Bell states is actually very typical for states that are reasonably random.
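The Bell-state intuition can be checked in a few lines; this sketch traces out the "hidden" qubit of a maximally entangled two-qubit pair and recovers the maximally mixed state I/2 on the "visible" qubit.

```python
import numpy as np

# Bell state (|00> + |11>) / sqrt(2): maximally entangled across the two qubits.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho = np.outer(bell, bell)  # full (pure) density matrix |psi><psi|

# Partial trace over the second (hidden) qubit: reshape to (i, j, k, l) indices
# and sum over the hidden indices j == l, leaving the visible qubit's state.
rho_v = np.trace(rho.reshape(2, 2, 2, 2), axis1=1, axis2=3)
```

Despite the full state being pure, `rho_v` equals I/2: measurements on the visible qubit alone are completely random.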
142
00:25:50.850 --> 00:26:09.660
Maria Kieferova: This result comes from Andreas Winter, who gave a talk yesterday. A few years ago, Andreas and his collaborators showed that if we randomly choose a state on the whole system, and we look at the
143
00:26:11.370 --> 00:26:29.760
Maria Kieferova: state on only one subsystem, and at the distance between the state on the subsystem and the maximally mixed state, then the difference between the maximally mixed state and the state on the subsystem will decrease,
144
00:26:30.930 --> 00:26:35.400
Maria Kieferova: it will be exponentially decreasing. So, in other words, the
145
00:26:36.450 --> 00:26:44.730
Maria Kieferova: reduced density matrix will be almost indistinguishable from a maximally mixed state.
146
00:26:46.320 --> 00:26:51.570
Maria Kieferova: In other words, we know that most states are very entangled.
147
00:26:52.770 --> 00:26:54.720
Maria Kieferova: Too entangled for our scenario.
148
00:26:56.340 --> 00:27:03.420
Maria Kieferova: So for our quantum neural networks, this means that we will work with some initialization, and
149
00:27:04.440 --> 00:27:25.590
Maria Kieferova: then we will perform measurements on all of our visible units, and in all typical cases, what we will be measuring will be purely random and won't really be sensitive to the circuit that we perform, or to deviations in the individual angles.
151
00:27:29.610 --> 00:27:46.770
Maria Kieferova: More formally, we can then show that the gradients will vanish exponentially in the number of hidden units, under the assumption that we are looking at a quantum neural network with many more hidden units than visible units.
152
00:27:48.450 --> 00:28:04.890
Maria Kieferova: For those more mathematically inclined: we were in fact bounding the Lipschitz constant for this scenario, and we also needed to assume that the Hermitian operator that we measured was bounded.
153
00:28:12.210 --> 00:28:14.460
Maria Kieferova: We did majority of our work.
154
00:28:15.690 --> 00:28:30.060
Maria Kieferova: theoretically, but we also ran numerical experiments showing that if we look at the trace distance between the reduced density matrix of our models and the maximally mixed state,
155
00:28:30.600 --> 00:28:51.090
Maria Kieferova: as the number of hidden units increases, our reduced density matrix gets closer and closer to the maximally mixed state, and this holds true both for unitary quantum neural networks and quantum Boltzmann machines.
156
00:28:52.290 --> 00:29:06.870
Maria Kieferova: I forgot to mention: this scaling was only for unitary quantum neural networks, but we were also able to achieve a very similar result for quantum Boltzmann machines.
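A rough numerical sketch of this concentration effect: draw Haar-random pure states on one visible plus h hidden qubits, trace out the hidden qubits, and measure the trace distance to I/2. The system sizes and sample counts are arbitrary choices for illustration, not the experiments from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def avg_trace_distance(hidden, samples=50):
    """Average trace distance between the 1-qubit reduced state and I/2."""
    dists = []
    for _ in range(samples):
        dim = 2 * 2**hidden
        # A normalized complex Gaussian vector is a Haar-random pure state.
        psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
        psi /= np.linalg.norm(psi)
        rho = np.outer(psi, psi.conj())
        # Partial trace over the hidden subsystem.
        rho_v = np.trace(rho.reshape(2, 2**hidden, 2, 2**hidden), axis1=1, axis2=3)
        delta = rho_v - np.eye(2) / 2
        # Trace distance: half the sum of absolute eigenvalues of the difference.
        dists.append(0.5 * np.sum(np.abs(np.linalg.eigvalsh(delta))))
    return np.mean(dists)

distances = [avg_trace_distance(h) for h in (1, 3, 5)]  # shrinks as h grows
```

The average distance shrinks rapidly as hidden qubits are added, mirroring the talk's observation for both model families.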
157
00:29:07.650 --> 00:29:17.730
Maria Kieferova: It was just a little bit more technical. Similarly, when we look at how the variance of the gradients and the gradients themselves
158
00:29:18.420 --> 00:29:29.160
Maria Kieferova: change with the number of hidden units, even for very, very small quantum neural networks we observed a rapid decay of the gradients.
159
00:29:29.760 --> 00:29:43.560
Maria Kieferova: So even with very small quantum neural networks, we are barely able to learn anything if we are using hidden units that we later trace over.
160
00:29:46.080 --> 00:29:47.310
Maria Kieferova: This is perhaps a.
161
00:29:49.380 --> 00:29:57.450
Maria Kieferova: This might be a very confusing observation, because the story people often tell about quantum machine learning
162
00:29:57.870 --> 00:30:13.500
Maria Kieferova: and quantum computing in general is that entanglement is just a good thing: because of entanglement, we are getting all of this amazing power in quantum computers, and it might seem that more entanglement would always be a positive thing.
163
00:30:14.880 --> 00:30:15.690
Maria Kieferova: But this is not.
164
00:30:16.770 --> 00:30:20.970
Maria Kieferova: strictly true. We know that if we are working
165
00:30:22.020 --> 00:30:26.610
Maria Kieferova: only with product states, and we have no entanglement whatsoever in our quantum
166
00:30:27.810 --> 00:30:32.670
Maria Kieferova: circuits, we would be able to simulate this computation classically.
167
00:30:34.440 --> 00:30:35.160
Maria Kieferova: However.
169
00:30:40.230 --> 00:30:49.110
Maria Kieferova: as we show in this work, if we have too much entanglement, the system can again be simulated classically.
170
00:30:49.680 --> 00:31:06.780
Maria Kieferova: This is because too much entanglement leads to concentration of measure, which just makes our subsystems completely random. So we will need to find some middle ground in the amount of entanglement when designing
171
00:31:07.830 --> 00:31:24.630
Maria Kieferova: quantum neural networks: not too much, not too little. And this is not a completely new type of result, because several years ago, people who looked into measurement-based quantum computation also observed
172
00:31:26.640 --> 00:31:29.070
Maria Kieferova: a similar dichotomy.
173
00:31:35.580 --> 00:31:52.890
Maria Kieferova: So what would be some types of systems that have a good amount of entanglement, not too much, not too little? People have been looking at quantifying entanglement for a while now.
174
00:31:54.180 --> 00:32:02.070
Maria Kieferova: And we show that if the entanglement between hidden and visible units follows the so-called area law,
175
00:32:03.510 --> 00:32:16.140
Maria Kieferova: we might have a chance to learn the gradients. However, if the entanglement follows a volume-law scaling, which is much more typical, then we would
176
00:32:17.190 --> 00:32:21.090
Maria Kieferova: always observe a quantum barren plateau.
177
00:32:23.070 --> 00:32:28.500
Maria Kieferova: This is a very bleak scenario. It basically says that for
178
00:32:30.540 --> 00:32:45.030
Maria Kieferova: many reasonable quantum neural networks, they won't be useful, because we don't have a way to train them: during training we are not really learning anything, we are not moving anywhere.
179
00:32:46.110 --> 00:32:58.170
Maria Kieferova: So the natural question to ask is: is there any way we could escape from a barren plateau? The work that I'm frantically trying to finish this week and next week
180
00:32:59.700 --> 00:33:09.390
Maria Kieferova: says that yes, there is an assumption behind the quantum barren plateau results that can be violated in many cases.
181
00:33:11.340 --> 00:33:14.580
Maria Kieferova: And the assumption is about the objective function.
182
00:33:15.660 --> 00:33:30.990
Maria Kieferova: So recall that the only thing that was relevant for the quantum barren plateau was the norm of the objective function; we assumed that the operators that we are measuring are bounded
183
00:33:32.220 --> 00:33:41.370
Maria Kieferova: by a constant, or that the infinity norm of the operators, which figures in the expression for the Lipschitz constant, is bounded.
184
00:33:42.480 --> 00:33:46.320
Maria Kieferova: And this is a very well physically
185
00:33:48.270 --> 00:33:49.560
Maria Kieferova: motivated assumption.
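As a toy illustration of why bounded observables give bounded (Lipschitz) objectives, here is a sketch of my own, not from the talk: for a single-qubit rotation measured in Z, the parameter-shift gradient can never exceed the spectral norm of the observable.

```python
import numpy as np

Z = np.diag([1.0, -1.0])  # bounded observable, ||Z||_inf = 1

def ry(t):
    # single-qubit Y rotation
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]])

def objective(t):
    # f(t) = <0| RY(t)^dag Z RY(t) |0> = cos(t)
    psi = ry(t) @ np.array([1.0, 0.0])
    return float(psi @ Z @ psi)

for t in np.linspace(0.0, 2 * np.pi, 50):
    # parameter-shift rule: exact gradient for rotation gates
    grad = 0.5 * (objective(t + np.pi / 2) - objective(t - np.pi / 2))
    assert abs(grad + np.sin(t)) < 1e-9  # matches d/dt cos(t) = -sin(t)
    assert abs(grad) <= 1.0              # bounded by ||Z||_inf
```

The same argument scales up: since each parameter-shift gradient is a difference of two expectation values of the measured operator, its magnitude is always at most the operator's infinity norm.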
186
00:33:50.670 --> 00:33:55.950
Maria Kieferova: Indeed, directly measuring unbounded objective functions is not very practical.
187
00:33:57.090 --> 00:34:14.310
Maria Kieferova: However, our idea is that we would still choose an unbounded objective function that has simple gradients. In traditional machine learning this is very common; for example, for training people are using the KL
188
00:34:14.310 --> 00:34:15.120
Aditya Ramamoorthy: divergence.
189
00:34:15.360 --> 00:34:16.530
Maria Kieferova: Which is unbounded.
190
00:34:17.070 --> 00:34:22.830
Maria Kieferova: The KL divergence between states that are orthogonal diverges to infinity.
191
00:34:23.940 --> 00:34:43.710
Maria Kieferova: And the way we are able to train using unbounded objective functions is that we measure the gradients directly and never try to estimate the unbounded function itself. So we don't really know how close we are to a good solution; we just want to know that we are always going downhill.
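A classical analogue of this gradient-only training can be sketched as follows (my own illustration with a hypothetical four-outcome softmax model, not the talk's quantum setup): for KL(data ‖ model) over an exponential-family model, the gradient is simply the model expectation minus the data expectation of the features, so we can follow sampled gradients downhill without ever evaluating the possibly unbounded KL value.

```python
import math, random

rng = random.Random(7)
target = [0.1, 0.2, 0.3, 0.4]  # hypothetical data distribution

def softmax(theta):
    z = [math.exp(t) for t in theta]
    s = sum(z)
    return [x / s for x in z]

def sample_mean(p, n):
    # Monte Carlo estimate of E[one-hot(x)] = p from n samples
    counts = [0] * len(p)
    for _ in range(n):
        r, acc = rng.random(), 0.0
        for i, pi in enumerate(p):
            acc += pi
            if r < acc:
                counts[i] += 1
                break
        else:
            counts[-1] += 1  # guard against floating-point round-off
    return [c / n for c in counts]

theta = [0.0] * 4
for _ in range(2000):
    # grad of KL(data || model) w.r.t. theta = E_model[phi] - E_data[phi];
    # note the KL value itself is never computed anywhere in this loop
    g_model = sample_mean(softmax(theta), 200)
    g_data = sample_mean(target, 200)
    theta = [t - 0.1 * (m - d) for t, m, d in zip(theta, g_model, g_data)]

model = softmax(theta)
assert max(abs(m - t) for m, t in zip(model, target)) < 0.1
```

The loop only ever compares two sampled expectations, mirroring the strategy described above: keep moving downhill even though the loss value is never estimated.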
192
00:34:46.710 --> 00:35:11.250
Maria Kieferova: There are many divergences one could try to apply that are similar to the KL divergence, but in our paper we argue that the maximal quantum Rényi divergence, sometimes also known as the Belavkin-Staszewski divergence, would be a good objective function for a range of quantum machine learning models.
193
00:35:13.350 --> 00:35:19.740
Maria Kieferova: We are in particular looking at the special case of alpha equal to two, because
194
00:35:20.760 --> 00:35:40.710
Maria Kieferova: this type of Rényi divergence upper bounds the KL divergence, so it is a good upper bound on the quantity that people in traditional machine learning are interested in, but it also has a very simple form that allows us to estimate the gradient.
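The ordering already holds for classical distributions, since Rényi divergences are nondecreasing in the order α and KL is the α → 1 limit. A quick numerical sanity check of my own (the paper's quantum version uses the maximal quantum Rényi divergence; this classical sketch only illustrates the upper-bound property):

```python
import math, random

def kl(p, q):
    # KL divergence, the alpha -> 1 limit of the Renyi family
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def renyi2(p, q):
    # Renyi divergence of order 2: log sum_i p_i^2 / q_i
    return math.log(sum(pi * pi / qi for pi, qi in zip(p, q)))

def rand_dist(n, rng):
    w = [rng.random() + 1e-9 for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

rng = random.Random(0)
for _ in range(1000):
    p, q = rand_dist(8, rng), rand_dist(8, rng)
    # Renyi divergences are monotone in alpha, so D_2 >= D_KL
    assert kl(p, q) <= renyi2(p, q) + 1e-12
```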
195
00:35:42.060 --> 00:35:52.320
Maria Kieferova: And in our work we propose that these gradients can be estimated by sampling; sometimes we estimate the forward divergence.
196
00:35:52.710 --> 00:36:15.210
Maria Kieferova: But in many cases it might be more practical to estimate the reverse divergence. These two quantities are of course not the same, divergences are not symmetric, but in the case when both rho and sigma, both our model and our data, are full rank,
197
00:36:17.280 --> 00:36:21.390
Maria Kieferova: both divergences would lead to the correct solution.
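A tiny classical illustration of this point (my own sketch): forward and reverse KL disagree in value, so the divergence is not symmetric, but when both distributions have full support they vanish at exactly the same point, the correct solution.

```python
import math

def kl(p, q):
    # KL divergence between two full-support distributions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p = [0.2, 0.3, 0.5]    # "data" (full support)
q = [0.25, 0.25, 0.5]  # "model" (full support)

forward, reverse = kl(p, q), kl(q, p)
assert forward > 0 and reverse > 0      # both detect the mismatch
assert abs(forward - reverse) > 0       # but they are not symmetric
assert kl(p, p) == 0 and kl(q, q) == 0  # both vanish exactly at p == q
```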
199
00:36:25.200 --> 00:36:36.720
Maria Kieferova: We also applied our approach to learning thermal states, which is a very interesting application. When we talk to physicists,
200
00:36:37.260 --> 00:36:54.120
Maria Kieferova: they will generally tell you that thermal states are the easiest thing ever, everything in the lab thermalizes. However, we know from computer science that the computational problem behind thermalization is
201
00:36:55.140 --> 00:37:14.310
Maria Kieferova: #P-hard or QMA-hard, depending on exactly how we formulate the problem. So learning thermal states is something that can be extremely hard in some scenarios, but in many cases it is also very easy.
202
00:37:17.460 --> 00:37:17.910
Maria Kieferova: So.
203
00:37:19.890 --> 00:37:32.520
Maria Kieferova: By following the maximal Rényi divergence, we show that for small quantum neural networks we are in fact able to learn
204
00:37:33.000 --> 00:37:49.830
Maria Kieferova: thermal states with remarkable precision. In the first plot you can see the loss function on the y axis and how it changes with the number of epochs.
205
00:37:51.000 --> 00:38:10.890
Maria Kieferova: And as we increase the number of hidden units, our loss function decreases significantly. This is excellent news, because more hidden units should help our quantum machine learning models; this is the intuition
206
00:38:11.970 --> 00:38:18.150
Maria Kieferova: that we have from traditional machine learning, and the numerical data
207
00:38:20.430 --> 00:38:32.190
Maria Kieferova: indicates that we are really not experiencing an entanglement-induced barren plateau, the one that we showed earlier in this talk.
208
00:38:33.060 --> 00:38:47.640
Maria Kieferova: And on the right-hand side we are also looking at the fidelity between our model and our data, and fidelity is a proper measure of distance. In both scenarios, both plots show
209
00:38:49.950 --> 00:39:06.300
Maria Kieferova: that in this case our model is really learning, and at least in these small cases we are not getting stuck in a barren plateau at any point of the training.
210
00:39:07.890 --> 00:39:28.860
Maria Kieferova: I don't really want to get into the more technical details of how we are using the sampling algorithms, or of what exactly the conditions for avoiding barren plateaus would be.
211
00:39:30.180 --> 00:39:35.430
Maria Kieferova: We have partially answered some of these questions
212
00:39:37.200 --> 00:39:52.290
Maria Kieferova: in our paper, but there are still many questions that remain open. The first question is whether there are other types of barren plateaus that people haven't discovered yet.
213
00:39:53.040 --> 00:40:03.090
Maria Kieferova: We now know about several scenarios for how not to train quantum neural networks, but there might be several other traps that we are not yet aware of.
214
00:40:05.010 --> 00:40:19.860
Maria Kieferova: A similar question is when we can say that computing the gradient is efficient, that the gradient is bounded away from zero, and in what cases we would be able to say anything about convergence.
215
00:40:24.360 --> 00:40:38.790
Maria Kieferova: Another question that remains open is whether there is a way to characterize the states that quantum computers can and cannot learn. And lastly, is there any noise tolerance in our quantum
216
00:40:39.600 --> 00:40:57.120
Maria Kieferova: machine learning algorithms? The algorithms I talk about most often are algorithms for fault-tolerant quantum computers; these are mature quantum computers with many qubits and full error correction. But it will take quite a few years until
217
00:40:58.230 --> 00:41:09.300
Maria Kieferova: experimental physicists and engineers manage to build these fault-tolerant quantum computers, and it would be amazing if we came up with
218
00:41:10.170 --> 00:41:28.980
Maria Kieferova: quantum algorithms that can tolerate some noise. There is a possibility that quantum machine learning could potentially provide some of these more noise-tolerant quantum algorithms.
219
00:41:30.240 --> 00:41:42.780
Maria Kieferova: So, to conclude: I briefly talked about some of the big problems and challenges in quantum machine learning, and I see these problems
220
00:41:44.190 --> 00:42:07.380
Maria Kieferova: as opportunities to do some interesting research with beautiful math and difficult computer science problems. Then I introduced quantum Boltzmann machines and unitary quantum neural networks, and showed that both of them can experience vanishing gradients if
221
00:42:08.550 --> 00:42:23.730
Maria Kieferova: our quantum machine learning models use a large number of hidden units. And lastly, I talked about escaping barren plateaus by using an unbounded
222
00:42:24.840 --> 00:42:25.770
Maria Kieferova: objective function.
223
00:42:27.060 --> 00:42:44.550
Maria Kieferova: So just to summarize, for those who were too busy replying to emails or otherwise preoccupied: one of the central questions in quantum machine learning is trainability. What are the models that we can train, and how do we train them?
224
00:42:46.020 --> 00:42:54.750
Maria Kieferova: And once we come up with these models, we really need to construct them carefully to avoid barren plateaus.
225
00:42:56.550 --> 00:43:04.290
Maria Kieferova: As I was asked earlier during the questions, coming up with well-constructed quantum circuits
226
00:43:05.250 --> 00:43:22.200
Maria Kieferova: that would be good for quantum machine learning algorithms is not trivial, because many of them would exhibit barren plateaus and therefore wouldn't really be useful. So thank you very much for your attention, and I'm happy to answer any questions.
227
00:43:32.040 --> 00:43:34.320
Aditya Ramamoorthy: Thanks for the nice talk. Can you
228
00:43:35.580 --> 00:43:35.970
Aditya Ramamoorthy: hear me?
229
00:43:37.200 --> 00:43:37.980
Maria Kieferova: Yes, I can hear you.
230
00:43:38.160 --> 00:43:47.370
Aditya Ramamoorthy: Oh yeah, so I have a question. You mentioned at the end that you are picking an objective function, this quantum relative entropy, that is unbounded.
231
00:43:48.270 --> 00:43:59.310
Aditya Ramamoorthy: But based on what you're saying, the reduced density matrix will have full support, right? So in practice, this loss function that you're picking:
232
00:43:59.970 --> 00:44:16.140
Aditya Ramamoorthy: does it actually blow up to infinity? Because the reduced density matrix, based on your results, will have full support, and the cross entropy that you're computing will also typically not blow up to infinity. I don't know, I'm just asking.
233
00:44:17.250 --> 00:44:27.570
Maria Kieferova: Yeah, so if the states are orthogonal, so basically in practice when we start from states that are very far from each other, completely different, then we would have...
234
00:44:29.400 --> 00:44:41.250
Maria Kieferova: No, in the limit of the dimensions going to infinity, our objective, our divergence between the states, will also be growing to infinity.
235
00:44:42.600 --> 00:44:43.740
Aditya Ramamoorthy: In what dimension, sorry?
236
00:44:44.520 --> 00:44:46.560
Maria Kieferova: In the dimensions of the system.
237
00:44:48.240 --> 00:44:48.960
Aditya Ramamoorthy: So in the number of.
238
00:44:49.770 --> 00:45:04.680
Aditya Ramamoorthy: qubits? OK, but for most of these results that you presented at the end, how many qubits are there? I mean, yeah, that makes sense: if the number of qubits goes to infinity then the divergence will also be growing, but
239
00:45:06.390 --> 00:45:17.580
Aditya Ramamoorthy: is that the kind of scaling you need to circumvent the original argument you made, that lambda needs to be of the order of the infinity norm of H times something?
240
00:45:18.720 --> 00:45:19.500
Maria Kieferova: So, so in our.
241
00:45:19.530 --> 00:45:21.090
Maria Kieferova: In the theoretical analysis.
242
00:45:21.330 --> 00:45:32.220
Maria Kieferova: we are looking at the asymptotics, and we are showing that asymptotically, by using this objective function, we would be breaking the assumptions. And then
243
00:45:32.910 --> 00:45:49.080
Maria Kieferova: my colleague was looking at what conditions we can put on our training data and the model to ensure that we won't experience the barren plateaus.
244
00:45:50.700 --> 00:46:10.470
Maria Kieferova: And in our numerical study we're also showing that even for very small quantum neural networks with just a handful of qubits, if we use the naive training, we will run into a barren plateau pretty much with, you know, four or five qubits. But...
245
00:46:10.770 --> 00:46:15.870
Aditya Ramamoorthy: When you say the naive training, what is the objective function that you use there?
246
00:46:16.710 --> 00:46:18.030
Maria Kieferova: So we would.
247
00:46:19.890 --> 00:46:22.860
Maria Kieferova: basically try to minimize an energy, so we would
248
00:46:24.090 --> 00:46:28.050
Aditya Ramamoorthy: Minimize the energy of some Hermitian matrix which you're measuring?
249
00:46:28.740 --> 00:46:37.320
Maria Kieferova: Yeah, you can have either a Hermitian matrix or you can look at the projector on an unknown state.
250
00:46:37.920 --> 00:46:38.160
Okay.
251
00:46:39.900 --> 00:46:40.410
Thank you.
252
00:46:43.410 --> 00:46:53.190
Marius Junge: I have one more question. I was looking at Hamiltonians coming from generators of Lindbladians, for instance on a graph.
253
00:46:53.640 --> 00:47:04.500
Marius Junge: And there we have a lot of estimates for the KL divergence as a function of time. So I mean, we know that the KL divergence, even with entanglement, decreases exponentially.
254
00:47:05.460 --> 00:47:13.320
Marius Junge: So I just wanted to know whether this KL divergence gradient has already been studied; I mean, have people investigated it?
255
00:47:13.890 --> 00:47:29.130
Marius Junge: Because the gradient in that case is connected to some version of quantum Fisher information, not the standard one, and that is a very interesting quantity. So I just wanted to know whether that analysis has been done in this particular case.
256
00:47:29.460 --> 00:47:43.020
Maria Kieferova: OK, so let me try to understand your question a little bit better. You're looking at some spin system or something on a graph, and you're letting the system evolve, and you're looking at...
257
00:47:43.350 --> 00:47:47.970
Marius Junge: Right, so we're starting with the Hamiltonian, which is also the generator of the time evolution.
258
00:47:49.140 --> 00:47:50.040
Marius Junge: And then...
259
00:47:50.160 --> 00:47:55.770
Marius Junge: Then we can look at how these states behave under the time evolution; that's a very natural question.
260
00:47:57.210 --> 00:48:10.680
Marius Junge: And these time evolutions are of course realized by unitaries, right, which in principle we could write as local unitaries under some assumptions. So one could actually try to guess which Lindbladian and which
261
00:48:12.120 --> 00:48:17.490
Marius Junge: Hamiltonian is behind the picture, right. So if you want to learn
262
00:48:18.870 --> 00:48:33.330
Marius Junge: the Hamiltonian, we would probably know how the unitaries have to be composed in the picture you talked about; we just don't know which local unitaries are used. So I don't know whether that has been analyzed, but for a given
263
00:48:34.800 --> 00:48:38.130
Marius Junge: family of Hamiltonians, you would have the same
264
00:48:39.360 --> 00:48:50.670
Marius Junge: approximation with shallow circuits. And then the objective function would be to learn the KL divergence directly, because that is in many cases the one we understand best.
265
00:48:51.540 --> 00:48:52.680
Marius Junge: i'm not.
266
00:48:53.580 --> 00:49:03.030
Maria Kieferova: So one thing that people are looking at is Hamiltonian learning: you have some dynamics and you know that it's coming from some family of Hamiltonians,
267
00:49:04.050 --> 00:49:09.420
Maria Kieferova: and then you try to find what type of Hamiltonian it would be.
268
00:49:10.440 --> 00:49:14.160
Maria Kieferova: So yes, that's something that people are looking at.
269
00:49:15.030 --> 00:49:19.560
Marius Junge: But it just so happens, and that's why this is actually the second part of my question, that in this
270
00:49:19.860 --> 00:49:30.390
Marius Junge: picture which I'm looking at, the particular Hamiltonian is coming from the Lindbladian; it's just a very particular but very well studied, very well known case.
271
00:49:31.140 --> 00:49:37.320
Marius Junge: You would only double the amount of entanglement; you would never go beyond that, it's not necessary, right? Because
272
00:49:37.830 --> 00:49:54.120
Marius Junge: for the KL divergence you would just double the size of the hidden variables, because with more you would lose, right? So it's pretty clear that for the size of the system you want to analyze, you need to exactly double the amount with hidden variables.
273
00:49:54.900 --> 00:49:59.820
Marius Junge: Unless you want to write down the gates in a more complicated way, I mean.
274
00:50:00.960 --> 00:50:04.020
Marius Junge: But that's exactly what you need, so this
275
00:50:04.200 --> 00:50:05.460
Marius Junge: may be an explanation why.
276
00:50:05.820 --> 00:50:08.550
Marius Junge: too much entanglement means you're learning in the wrong class.
277
00:50:09.150 --> 00:50:12.060
Maria Kieferova: Yeah, you are looking at it, yes.
278
00:50:12.180 --> 00:50:24.690
Maria Kieferova: Well, it sort of always depends on what prior information you have when you're learning. If you already know a lot about your system, you can design the circuit.
279
00:50:25.980 --> 00:50:36.150
Maria Kieferova: You can design the circuit much better; you already know roughly how many extra qubits you need and what type of parametrized gates you need. But in many cases,
280
00:50:36.690 --> 00:50:50.670
Maria Kieferova: and for these specialized scenarios, people are coming up with better and better learning algorithms. But if you don't have a lot of prior knowledge, you just start somewhere random and you hope for the best, and the best will never happen.
281
00:50:51.810 --> 00:50:55.020
Marius Junge: yeah but then you're on the plateau right and.
282
00:50:55.290 --> 00:50:56.010
Marius Junge: yeah exactly.
283
00:50:57.270 --> 00:51:00.990
Marius Junge: you're not expecting that; you have to have some prior. OK, that makes
284
00:51:01.440 --> 00:51:02.310
Maria Kieferova: You will yeah yeah.
285
00:51:02.460 --> 00:51:06.600
Maria Kieferova: I mean, you would be surprised how often people expected it would just work.
286
00:51:08.610 --> 00:51:11.160
Maria Kieferova: But yeah, I have the same intuition here.
287
00:51:11.640 --> 00:51:18.360
Marius Junge: And just from a technical point: has the KL divergence as an objective function been studied directly or not?
288
00:51:19.200 --> 00:51:22.380
Maria Kieferova: um Okay, so it depends okay.
289
00:51:24.210 --> 00:51:28.380
Maria Kieferova: Are you looking at the KL divergence between some probability distributions, or...
290
00:51:28.380 --> 00:51:32.010
Marius Junge: I'm looking at the relative entropy of the corresponding
291
00:51:32.010 --> 00:51:32.880
Marius Junge: projected states.
292
00:51:33.480 --> 00:51:43.200
Maria Kieferova: Yeah, so we were looking at the relative entropy in, I think, 2016, in the context of quantum Boltzmann machines.
293
00:51:43.560 --> 00:51:55.800
Maria Kieferova: But the difficulty there is that if you have some hidden units and you want to compute the gradients of the relative entropy, you won't really get a closed form for the gradients; it becomes...
294
00:51:57.390 --> 00:52:00.420
Marius Junge: Okay, so that's different from the semigroup story, where you actually
295
00:52:00.420 --> 00:52:01.440
Marius Junge: do get a closed form.
296
00:52:02.100 --> 00:52:02.910
Maria Kieferova: mm hmm yeah.
297
00:52:03.180 --> 00:52:04.770
Maria Kieferova: If you do have.
298
00:52:05.370 --> 00:52:09.030
Marius Junge: In some cases you do have a closed form, you actually have a
299
00:52:09.120 --> 00:52:10.620
Marius Junge: very nice expression. Okay,
300
00:52:10.650 --> 00:52:11.280
Maria Kieferova: yeah yes.
301
00:52:11.340 --> 00:52:12.330
Marius Junge: very helpful, thank you.
302
00:52:12.960 --> 00:52:13.230
know.
303
00:52:22.290 --> 00:52:32.580
Maria Kieferova: Also, if people have questions later on, or are not comfortable asking online, feel free to reach out to me later; it's easy to find my email online.
304
00:52:35.460 --> 00:52:48.450
Nicholas LaRacuente: All right, well, I don't see any additional questions in the chat or any raised hands, so if anyone has any further questions you have another few seconds.
305
00:52:50.490 --> 00:52:54.930
Nicholas LaRacuente: And, otherwise I think that we will be taking a.
306
00:52:56.460 --> 00:52:58.110
Nicholas LaRacuente: break until.