WARNING: Summaries are generated by a large language model and may be inaccurate. We suggest that you use the synopsis, short and long summaries only as a loose guide to the topics discussed. The model may attribute to the speaker or other participants views that they do not in fact hold. It may also attribute to the speaker views expressed by other participants, or vice versa. The raw transcript (see the bottom of the page) is likely to be a more accurate representation of the seminar content, except for errors at the level of individual words during transcription.

Synopsis


In this video, we explore how the mathematics of solid state physics and Singular Learning Theory (SLT) can be used to gain a deeper understanding of each other. We discuss the dispersion relation, KL Divergence, twisted bilayer graphene, Scanning Tunneling Microscopes, Van Hove singularities, and more. We also analyze how RG techniques, phonons and other tools can be used to bridge the gap between theory and experiment to create toy models that reveal the things seen in experiments.

Short Summary


Solid state physics and singular learning theory (SLT) can be described using similar mathematics, so intuitions from each can inform the other. Singularities in the dispersion relation lead to divergences in the density of states, which manifest as observable bulk electrical properties. The KL divergence K(w) plays the role of an energy in SLT, analogous to the dispersion relation E(k): singularities in each determine the density of states, and hence electrical properties in solids and learning behavior in SLT. Twisted bilayer graphene is an example of a material where singularities in the density of states determine observable macroscopic bulk properties.
In Singular Learning Theory (SLT), bands are not present in the literal sense since there are no fermions. However, wave functions in a crystal can be plane waves or linear combinations of wave functions transforming in different representations, each with a different energy formula. These different formulas give different branches of the dispersion relation, making bands a useful concept in predicting electrical properties such as whether a material is a conductor, semiconductor or insulator, and how it responds to electric fields. Care must be taken when applying intuition from solid state physics to SLT, as intuitions about electrical properties do not always carry over.
Solid state physics is typically formulated in momentum space, whereas SLT is formulated in position space. Scanning Tunneling Microscopes (STM) use a sharp metal tip to measure the current between a sample and the microscope when a voltage is applied. This is related to solid state physics and can help us understand the density of states and the Bayesian posterior. STM tunneling current is used to observe atomic scale objects and is proportional to the differential conductance determined by the tunneling probability, which is exponentially dependent on the gap Z. The differential conductance is proportional to the tunneling probability, the amplitude and density of states, the wave function and the tunneling electron energy. The density of states is determined by the energy of the tunneling electron.
Scanning tunneling microscopes allow for the observation of individual quantum states, seen in the form of quantum corrals. By tuning the bias voltage, the tunneling electron energy can be adjusted to observe the amplitude of the wave function, as well as divergences in the density of states. Differential conductance is a measure of the energy function's dispersion relation and can be plotted to show divergences and singularities in the dispersion relation. The density of states for carbon nanotubes is derived from a simple exercise in first year quantum mechanics.
In a sensor, the wave function is split into two Hilbert spaces, one for the Z Direction and one for the X Y directions. Energy is determined by two quantum numbers, I and J, and the Z momentum, resulting in a family of 1D sub bands with a fixed momentum. Van Hove singularities occur when bands become newly accessible, resulting in an instantaneous infinite change in the number of states. This effect is seen in 1D systems, such as the free electron gas, and the structure of the bands in a carbon nanotube are marked by horizontal lines. These singularities affect the electrical and optical properties of 1D systems.
Semiconducting nanotubes exhibit enhanced optical emission and absorption due to singularities in the density of states, as described by Fermi's Golden Rule. These singularities are visible in graphs of the density of states, where the transition rate between two states is enhanced at points where the density of states diverges. Van Hove singularity is a useful analogy to understand the physical behavior in SLT, where the large number of available low energy states due to the singular DOS determines the physical behavior. Renormalization group methods can be used to understand the details of what's happening near the magic angle in twisted bilayer graphene.
Solid state physics is divided into experimental, theoretical and mathematical physics. Experimental physics involves measuring properties of materials, while theoretical physics involves approximations and predictions. Mathematical physics provides concepts such as groups, topology and geometry. SLT requires an analog of theoretical physics to apply it to deep learning and AI alignment, involving a set of principles and systems to make deductions. Experiments are needed to know if the approximations are correct, and toy models can be created to bridge the gap between theory and experiment.
RG techniques from solid state physics can be used to find toy models that reveal the things seen in experiments. These toy models can be used to prove theorems and compare the movement of trajectories to the current. When an electron transitions from one state to another, it emits a photon, which is similar to the plots from Watanabe's book. Phonons are perturbations of the crystal lattice, which can be used to inject energy into the system. This is analogous to raising temperature or injecting energy in a quantum system.

Long Summary


Solid state physics and singular learning theory can be described using similar mathematics. This allows the intuitions and approximations found in solid state physics to inform a deeper understanding of singular learning theory. Last time, the dispersion relation, density of states, Fermi energy and Fermi level, and band gaps and semiconductors were discussed. These ingredients and simple mathematics can be used to attempt to describe a semiconductor. The general paradigm is to go from unobservable microstates and their symmetries, via the density of states, to observable bulk properties.
Solid state physics and SLT are connected through the dispersion relation, the function that gives the energy of an electron state from its wave number. Singularities in the dispersion relation create divergences in the density of states, which manifest as observable bulk electrical properties. In twisted bilayer graphene, the precise nature of these divergences contributes to the various phases observed at the magic angle; carbon nanotubes are the example treated in detail in this seminar.
The KL divergence K(w) is appropriately thought of as an energy in SLT, and is analogous to the dispersion relation E(k) in solid state physics: singularities in both determine the density of states, and hence many electrical properties in solids and learning behavior in SLT (asymptotic free energy, generalization error). Twisted bilayer graphene is an example of a material where singularities in the density of states determine observable and important macroscopic bulk properties. However, SLT as currently formulated is completely bosonic and only deals with functions, not differential forms.
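The way singularities of K(w) shape the SLT analog of the density of states can be illustrated with a small numerical sketch. It estimates the volume V(eps) of parameters with K(w) < eps on the unit square for two toy energies that are assumptions chosen for illustration, not taken from the seminar: a regular K(w) = w1^2 + w2^2 and a singular K(w) = w1^2 * w2^2. In Watanabe's theory this volume scales like eps^lambda * (-log eps)^(m-1), so the log-log slope of V gives a rough estimate of the exponent lambda (1 for the regular example; 1/2 with a logarithmic correction for the singular one).

```python
import math

def volume_below(eps, slice_width, n=100_000):
    """Midpoint-rule estimate of Vol{(w1, w2) in [0,1]^2 : K(w) < eps}.

    slice_width(w1, eps) returns the length of the set of w2 in [0, 1]
    with K(w1, w2) < eps, so the 2D volume reduces to a 1D integral.
    """
    h = 1.0 / n
    return h * sum(slice_width((i + 0.5) * h, eps) for i in range(n))

def regular_slice(w1, eps):
    # Toy regular energy (assumption): K(w) = w1^2 + w2^2
    return min(1.0, math.sqrt(max(0.0, eps - w1 * w1)))

def singular_slice(w1, eps):
    # Toy singular energy (assumption): K(w) = w1^2 * w2^2
    return min(1.0, math.sqrt(eps) / w1)

def local_exponent(eps, slice_width):
    """Log-log slope of V between eps/10 and eps (finite-eps estimate of lambda)."""
    v_hi = volume_below(eps, slice_width)
    v_lo = volume_below(eps / 10.0, slice_width)
    return math.log(v_hi / v_lo) / math.log(10.0)

if __name__ == "__main__":
    print("regular  slope:", local_exponent(1e-4, regular_slice))
    print("singular slope:", local_exponent(1e-4, singular_slice))
```

The singular example has far more low-energy volume than the regular one at the same eps, which is the sense in which the singularity dominates the low-energy density of states. At finite eps the measured slope for the singular case sits below 1/2 because of the logarithmic factor.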
A generalization of SLT that takes into account noise and stochasticity, using ideas from supersymmetry, could potentially include fermions, but fermions are not part of SLT as it stands. Care must be taken when applying intuition from solid state physics to SLT, as intuitions about electrical properties do not always carry over. The Fermi-Dirac distribution becomes the Boltzmann distribution in some limit (far from the band gap), and bands are more complicated than they appear. Band structure determines whether a material is a conductor, semiconductor or insulator, and affects its response to electric fields.
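The limit in which the Fermi-Dirac distribution reduces to the Boltzmann distribution can be checked numerically. The sketch below compares the two occupation factors at energies a few kT above the chemical potential; the function names and the choice kT = 0.025 eV (roughly room temperature) are illustrative assumptions.

```python
import math

def fermi_dirac(energy, mu, kT):
    """Mean occupation of a fermionic state at the given energy."""
    return 1.0 / (math.exp((energy - mu) / kT) + 1.0)

def boltzmann(energy, mu, kT):
    """Classical Boltzmann occupation factor."""
    return math.exp(-(energy - mu) / kT)

def relative_error(energy, mu=0.0, kT=0.025):
    """Relative error made by replacing Fermi-Dirac with Boltzmann."""
    fd = fermi_dirac(energy, mu, kT)
    return abs(boltzmann(energy, mu, kT) - fd) / fd

if __name__ == "__main__":
    for x in (0.5, 1, 2, 5, 10):  # energy above mu in units of kT
        print(f"(E - mu)/kT = {x:>4}: relative error = {relative_error(x * 0.025):.2e}")
```

The relative error works out to exp(-(E - mu)/kT), so the Boltzmann form is a good approximation only for states well above the chemical potential, consistent with the caution that it is not always a good approximation.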
Bands are a concept used in the study of electrical properties of materials. In SLT, bands are not present in the literal sense as there are no fermions, however, the role they play in predicting electrical properties may have an equivalent. Wave functions in a crystal can be plane waves or linear combinations of different representations, each with a different energy formula. These different formulas give different branches of the dispersion relation, making bands a useful concept in predicting electrical properties.
Solid state physics is typically formulated in momentum space, while SLT is formulated in position space. In solid state physics, the material is infinite and periodic, whereas in SLT there is no similar structure on the space of parameters. To describe the electrical properties of solids, a density of states with particular degeneracies is needed. In SLT, this can be achieved through models with many phases, which can be used to get the same kinds of results as those obtained from the crystal point of view.
Scanning Tunneling Microscopes (STM) use a sharp metal tip to measure the current flowing between a sample and the microscope when a voltage is applied. The current is proportional to the tunneling probability, which is an exponential of various terms. This is a concept that can be applied to learning machines and is related to solid state physics. It can help us understand the density of states and the Bayesian posterior.
STM tunneling current is used to observe atomic scale objects and is proportional to the differential conductance. This is determined by the tunneling probability, which is exponentially dependent on the gap Z. A 0.1 nanometer change in Z can lead to an order of magnitude change in the tunneling probability. The differential conductance is proportional to the tunneling probability, the amplitude and density of states, the wave function and the tunneling electron energy. The density of states is determined by the energy of the tunneling electron, which is equal to one of the energies E_j.
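The claim that a 0.1 nanometer change in the gap changes the tunneling probability by about an order of magnitude can be sanity-checked with the textbook one-dimensional barrier estimate, in which the probability falls off as exp(-2 * kappa * Z) with kappa = sqrt(2 m phi) / hbar. This is a minimal sketch: the barrier height phi = 4.5 eV is an assumed typical metal work function, not a value given in the seminar.

```python
import math

# Physical constants in SI units
M_E = 9.1093837015e-31   # electron mass, kg
HBAR = 1.054571817e-34   # reduced Planck constant, J s
EV = 1.602176634e-19     # one electronvolt in joules

def decay_constant(barrier_ev):
    """Inverse decay length kappa (1/m) for a rectangular barrier of given height."""
    return math.sqrt(2.0 * M_E * barrier_ev * EV) / HBAR

def tunneling_ratio(delta_z_nm, barrier_ev=4.5):
    """Factor by which the tunneling probability drops when the gap widens by delta_z_nm."""
    kappa = decay_constant(barrier_ev)
    return math.exp(2.0 * kappa * delta_z_nm * 1e-9)

if __name__ == "__main__":
    print(f"kappa = {decay_constant(4.5) * 1e-9:.2f} per nm")
    print(f"probability ratio for 0.1 nm: {tunneling_ratio(0.1):.1f}")
```

Under these assumptions the ratio comes out a little below ten, which matches the "order of magnitude" statement and is what gives the microscope its sensitivity to atomic-scale height changes.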
Scanning tunneling microscopes allow for the observation of individual quantum states. By tuning the bias voltage, the tunneling electron energy can be adjusted to a particular value, allowing for the amplitude of the wave function to be observed. This is seen in the form of quantum corrals, where the differential conductance is plotted in a scale, showing rings which represent a quantum state. Additionally, by tuning the bias voltage, divergences in the density of states can be observed, as the right hand side of the formula will blow up as V approaches a critical value.
Differential conductance is a measure of the energy function's dispersion relation. On the right side of the plot, the y-axis is scaled by voltage and the x-axis is the voltage bias. The graph shows divergences which come from divergences in the density of states and singularities in the dispersion relation. On the left side of the plot is energy versus density, showing the conduction and valence bands, as well as absorption and fluorescence. The lecturer then explains the derivation of the density of states for carbon nanotubes, which is a simple exercise in first year quantum mechanics.
In a sensor, the only interesting activity occurs in the Z Direction. The wave function can be split into a product of two Hilbert spaces, one for the Z Direction and one for the X Y directions. The energy of the wave function is determined by two quantum numbers, I and J, and the Z momentum. The dispersion relation is a family of 1D sub bands, each with a different wave function and a fixed momentum. The density of states is determined by the energy contribution from the X Y directions and the free electron gas in one dimension. The number of states is proportional to the volume of a sphere radius.
Van Hove singularities occur when bands become newly accessible, resulting in an instantaneous infinite change in the number of states. This effect can be seen in 1D systems, illustrated by a free electron gas. In the example of a carbon nanotube, multiple nanotubes of different radii and chirality contribute to the structure of the bands, which are marked by horizontal lines. The absorption and fluorescence arrows are related to this structure, and the phenomenon of van Hove singularities affects the electrical and optical properties of 1D systems.
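A minimal sketch of how these singularities arise in a family of 1D sub-bands: in units where constants are dropped, each sub-band with bottom energy E_n contributes 2 * sqrt(E - E_n) states below energy E, so its density of states is 1 / sqrt(E - E_n), which diverges as E approaches E_n from above. The sub-band bottoms used here (1, 4, 9) are hypothetical values chosen for illustration.

```python
import math

def integrated_states(energy, band_bottoms):
    """Number of states below `energy`, summed over 1D sub-bands (constants dropped)."""
    return sum(2.0 * math.sqrt(energy - e0) for e0 in band_bottoms if energy > e0)

def density_of_states(energy, band_bottoms):
    """Derivative of integrated_states: an inverse square root per open sub-band."""
    return sum(1.0 / math.sqrt(energy - e0) for e0 in band_bottoms if energy > e0)

if __name__ == "__main__":
    bottoms = [1.0, 4.0, 9.0]  # hypothetical sub-band bottoms
    for gap in (1e-1, 1e-2, 1e-4, 1e-6):
        e = 4.0 + gap
        print(f"E = 4 + {gap:.0e}: N = {integrated_states(e, bottoms):.4f}, "
              f"DOS = {density_of_states(e, bottoms):.1f}")
```

The number of states stays continuous as a new sub-band opens, but its slope, the density of states, blows up at each sub-band bottom. That is the "instantaneous infinite change" described above.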
Optical emission and absorption of semiconducting nanotubes is dominated by singularities in the density of states. This is described by Fermi's Golden Rule, which states that the transition rate between two states is enhanced in places where the density of states is divergent. This is illustrated by a graph, with the density of states diverging at ε_C2 like (ε − ε_C2)^(−1/2). The formula for the transition rate includes a delta function, which is dependent on the frequency of the incident photon and the energy of the final state.
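For reference, the standard textbook form of Fermi's Golden Rule for absorption of a photon of frequency \omega can be written as follows. The symbols H' (the perturbation coupling the electron to the light), M (a typical matrix element) and \rho (the density of final states) are notation introduced here for illustration and may differ from what was written on the board.

```latex
\Gamma_{i \to f} = \frac{2\pi}{\hbar}\,
  \bigl|\langle f \,|\, H' \,|\, i \rangle\bigr|^{2}\,
  \delta\!\left(E_f - E_i - \hbar\omega\right),
\qquad
\Gamma_i = \sum_f \Gamma_{i \to f}
  \approx \frac{2\pi}{\hbar}\,|M|^{2}\,\rho\!\left(E_i + \hbar\omega\right),
\qquad
\rho(\varepsilon) \propto \left(\varepsilon - \varepsilon_{C2}\right)^{-1/2}
  \ \text{for } \varepsilon \to \varepsilon_{C2}^{+}.
```

Summing the delta function over final states replaces it with the density of final states at energy E_i + \hbar\omega, so the rate inherits the inverse square root divergence whenever the photon energy lands on a sub-band edge.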
The density of states in a semiconductor can be divergent, leading to an enhanced response when the energy gap between two states matches the photon energy ħω of the incident light. This can be seen in the color of the material, and the divergence is known as a Van Hove singularity. This analogy can be useful in understanding physical behavior in SLT, where the large number of available low energy states due to the singular DOS determines the physical behavior. Renormalization group methods can be used to understand the details of what's happening near the magic angle in twisted bilayer graphene.
Solid state physics is divided into experimental, theoretical and mathematical physics. Experimental physics involves measuring properties of materials such as carbon nanotubes and graphene. Theoretical physics involves approximations and predictions to explain experimental results, such as the density of states, perturbations and RG methods. Mathematical physics provides concepts such as groups, topology and geometry to explain the theoretical physics.
Solid state physics is a layered pyramid of experiment, theory and mathematics. The theoretical layer suggests new approximations and looks for new physics. Devices like the scanning tunneling microscope are designed on the basis of the theoretical physics and interpreted on that basis. Density of states plots and differential conductance plots can be used to infer the band structure of a material. Theory and mathematics are distinct, and a theoretical physicist is not always the same as a mathematical physicist.
SLT requires an analog of theoretical physics in order to apply it to deep learning and AI alignment. This involves a set of principles informed by mathematical foundations, and systems to make deductions on the basis of these approximations. Experiments are needed to know if the approximations are correct. Toy models should be created to replicate the theoretical approximations and bridge the gap between theory and experiment. This can be done with devices akin to scanning tunneling microscopes, which are sensitive to the density of states and can reveal the structure of the density of states.
RG techniques from solid state physics should be borrowed to find toy models that reveal the things seen in experiments. Toy models should be used to prove lots of beautiful theorems, but they need to be supplemented with other layers. It is possible to compare the movement of trajectories to the current, and a formula with the right hand side that has the density of states can be used to do this. It is possible to build devices that can simulate trajectories near singular level sets.
When an electron transitions from one state to another, it will emit a photon. This is similar to the plots from Watanabe's book, where the energy of the Kullback-Leibler Divergence is plotted from 0 to infinity. These divergences are singular level sets of K, and represent a phase transition. In order to measure this, the hamiltonian is coupled to other degrees of freedom, such as the position degrees of freedom of a particle. This allows us to read out information from the system.
Phonons are perturbations of the crystal lattice away from its ground state due to intermolecular forces between the atoms and ions. These perturbations can be thought of as Goldstone bosons, due to the symmetry of the lattice. They can be used to inject energy into the system, similar to an incident photon. This is analogous to raising temperature or injecting energy in a quantum system.
Phonons are a way of pumping energy into a crystal lattice, travelling around and interacting with electrons, and are a big part of solid-state physics. Toy models, such as small neural and transformer networks, should be prioritised as they are hard to prove things about, but are important to the alignment community. Careful balance should be kept between rigorous and non-rigorous approaches, and an insane number of variations should be tried. Sampling from a true distribution can be used to move the tradition closer or further away from a minimum.

Raw Transcript


all right so welcome everybody so I'm going to continue the discussion from last time which was about the connection between solid state physics and singular learning theory um I want to be clear that it's not like a precise correspondence right it's more like in the vein of much of the application of mathematics uh these two things apparently different can be described using similar mathematics and therefore the intuitions that we have in solid state physics and particularly the approximations that we find useful might also inform a deeper understanding of singular learning theory so maybe I'll just put up a brief recollection from last time as a solid state physics too so last time we had the dispersion relation so remember that was the function that computes the energy from the wave number which parameterizes an electron state for us that it's not the only um valid quantum numbers for for an electron but usually we incorporate the other things just by having multiple energy functions I'll talk about that a bit later so we talk about dispersion relations the density of states remember this is the number of new States per unit volume made available when you increase the energy from E to Delta e dos we talked about the Fermi energy and Fermi level so the idea there was that all the states below the Fermi level occupied surface so you can imagine that there's some states you organize them by energy and then below a given energy well some of the states are there and then at the Fermi level you're looking at a level set of the energy that has some topology uh geometry and the the geometry of that level set is what determines whether how the density of States behaves so the dense States has divergences for example where that level set changes its topology well that's one way in which it will have a Divergence not the only one okay and we talked about band gaps and semiconductors and the point of that was to illustrate the process of model making so using these 
ingredients and some simple mathematics how you can attempt to describe something like a semiconductor and have a sort of reasonable um sort of First Take on the theory of the the physics there and there was a kind of General Paradigm which I was advertising which was the ability to go from microstates and you could say their symmetries that's unobservable right by definition essentially maybe not by definition but uh we sort of make up the microstates right we're starting from the macroscopic world in our measurements and then uh
we're sort of coming up with some notion of microstate we typically can't observe those microstates directly um but the say symmetries of the microstates that's the dominating factor in solid state physics and crystals it's the Symmetry group of the crystal lattice um that determines the kind of wave functions for electrons that can exist in that medium and through that determines the density of states the density of states is is again a sort of mathematical object um we can't observe it directly but we can observe bulk electrical properties like whether a material is a semiconductor the thing I want to note in passing is that it's uh well not in passing it's actually the main purpose of this of this talk uh it's the singularities in the dispersion relation which create divergences in the density of states that lead to the most observable properties of the bulk right so far from being these measure zero things that are on a like just irrelevant because they occupy such a tiny fraction of the space of momenta or in SLT the space of parameters the singularities determine the most important factors of the bulk electrical properties now not everything you can you know I don't want to overstate the role of singularities in solid-state physics or elsewhere um but that is the case Okay so last time we also discussed in twisted bilayer graphene how singularities in the dispersion relation um as I was just explaining manifest themselves as interesting macroscopic physics there's a whole literature on how the precise nature of the divergences and the singularities contributes to the various observed kind of phases of twisted bilayer graphene at the magic angle um so that's kind of it's its own thing so I won't say much more about that but what I'm going to treat in detail today is exactly those divergences in the density of states for carbon nanotubes so we'll go through that in some detail and you'll see all of this this kind of Paradigm worked out in that in that
example but before I get to carbon nanotubes what I want to do is spend some time on the analogies and dis analogies to SLT I sort of gave myself some leeway last time and not sort of stopping every five minutes to point out the ways in which it was different from solid state to physics okay but now I want to caveat a lot of these things okay so I did point out that the dispersion relation the core of the connection between solid state physics and SLT is that the dispersion relation all right well that is the energy and
the KL Divergence is also appropriately thought of as an energy in SLT so e of K and K of w and K being the KL Divergence W our parameter in SLT well they're similar in that they're both reasonably thought of as an energy and they're also similar in the sense that singularities in both determine the density of states or other divergences in the density of states and hence many electrical properties resp learning Behavior I'm using this in the sense Watanabe does I mean you know the kind of Bayesian things not of the precise trajectories or nature of trajectories of SGD or anything so asymptotic free energy generalization error and so on so we understand in both cases how this works I won't say any more about it so that's um that's an important valid analogy between the two well maybe I'll skip this one and sort of maybe can come back at the end um maybe I'm at risk of belaboring this point um okay I'll just mention that you can make twisted bilayer graphene with a pencil and Scotch tape I mean you won't be testing whether it's a superconductor without a lab but making it is easy okay so the point of saying this is that you might get the impression twisted bilayer graphene this super exotic material it exists only at four Labs on Earth and you you know you need to be doing some insane synthesis procedure to make it and in those kinds of weird materials yeah maybe singularities really matter but that is absolutely not the case right TBG is easy to make and well so are carbon nanotubes and many other materials in which this Paradigm of divergences in the density of States determining observable and important macroscopic bulk properties like electrical Properties or Optical properties this is just bog standard this is normal it's everywhere and of course it's also true in statistical learning theory as we understand so there's nothing exotic about singularities right they're all around you all the time um making a difference to very ordinary physics okay but I want
to kind of share a note of caution here um okay but so we've been talking about electrical properties like whether a material is an insulator or a conductor uh obviously that's about electrons right and there are no electrons in SLT or more to the point no fermions in SLT in sort of the physics language what we're talking about in SLT is it's just completely bosonic we're just talking about functions there are no there are no differential forms at least in the theory is currently formulated you can actually formulate a
kind of reasonable Theory which is a generalization of SLT that takes into account the noise the stochasticity in a way that is making use of ideas from supersymmetry and that probably is interesting and important so maybe you can put fermions into SLT it's not there right now and that's actually sort of beside the point it's not uh the point isn't to look at solid state physics and be say ah there's energy levels and I mean you fill up to the Fermi level why do you fill up well because they're fermions so they can't occupy the same state so they fill up and all of that might look like it it sort of hinges very tightly on the nature of fermions and well last time we were talking about Fermi-Dirac statistics and the distribution and so on so there is a lot in in solid state physics that you can't just naively transplant to be about SLT for this reason right at least when you're talking about electrical properties of solids but for example there are other properties of solids you can talk about in solid state physics you can talk about phonons you can talk about um a lot of other things that aren't about the fermionic degrees of freedom those you might find an easier time translating to SLT etc etc so you have to be a bit careful about what intuition you draw from solid state physics but I'm asserting as someone who thinks they understand both that this isn't an obstacle to drawing some rather deep insights from from one to the other necessarily uh yeah I just mentioned we saw last time actually a case where the the Fermi-Dirac distribution became the Boltzmann distribution in some limit right with this calculation we did at the end of last seminar um so if you're far away from the band Gap um then you you can treat the Fermi-Dirac distribution of the free electrons as being basically a Boltzmann distribution and then we're literally in the same territory as what we're talking about in SLT um but that's not it's not always a good approximation okay so that's kind of
a note of caution a second note of caution is that the analog of bands is is subtle okay so the electrons in a crystal are arranged in energy bands and whether a material is a conductor or semiconductor or insulator depends on the gaps between these bands and how filled they are and that story is actually more complicated than it looks in Kittel as Recent research has shown but that's kind of the basic story um and the response of a material to applied electric fields depends on the band structure
and so the theory is at least the electrical part of the theory is is organized around bands and and so on and so on um okay so you could ask the question are there bands in SLT and if you just ask the question in the most literal way the answer has to be no again this is to do with there are no fermions in SLT and if we're talking about bands literally as it's formulated in Kittel then no there are no bands in the SLT but I mean if you ask okay this conceptual object that we call bands and the role that it plays in predicting electrical properties and other properties of the material is there something equivalent to that which plays essentially uh the same role or a similar role to the way bands uh appear in in the treatment of electrical properties and materials uh the answer to that is probably yes and it just doesn't it's not justified in exactly the same way as as bands are so maybe maybe the answer is yes it depends what you mean by the question okay so let me elaborate a little bit so I didn't really say what a band was last time so strictly speaking a band is a branch of the dispersion relation I've been writing capital e that might seem a bit strange when we talked about the free electron this was just a function right I give you the momentum and three coordinate directions you get the energy but uh uh how should I say this so um if you think about the wave functions propagating in a regular material like a crystal some of them behave like plane waves and those are the ones that basically are like free electrons and the Energy Formula we had last time is correct but those whose momenta kind of match up with the lattice um those the solutions there are no solutions to the Schrödinger equation which are plane waves that that have those momentum the solutions near those momenta are different and that's what's interesting about about crystals and that's where representation Theory enters into solid state physics um and depending on the kind of wave function that
you take so the the wave functions that do involve momentum like that you can write them as linear combinations of different kinds of wave functions that transform different representations and those have different formulas for their energy function so you will have for each given momenta you will have multiple different kinds of wave functions that might have that momentum and they each have a different formula for their energy and those will give you different branches of the dispersion relation
and you can graph those and they'll make surfaces that usually don't touch sometimes they do and so that's that's the band structure right so that's where bands come from um okay so the first distinction to SLT is that whole story was about momentum right so this is about this isn't a momentum space picture whereas SLT is usually formulated in what a physicist called position space now if you've dug into the guts of SLT a bit if you look at Spencer's thesis for example or elsewhere you're often Taylor series expanding the uh the log density ratio in some preferred basis of functions and so you know maybe you don't usually take a basis that looks like plane waves but you could introduce a kind of momentum space picture into SLT if you wanted so far it doesn't seem to have been useful but it's maybe not playing such an important role now um and okay another difference is that in SSP in solid state physics the material is infinite and periodic in SLT there's no similar structure on the space of parameters right so the the phenomena I was just discussing in momentum space for s for solid state physics comes from some kind of periodic irregularity in real space position space but we don't have some kind of structure on the space of parameters that's analogous to that in in SLT so we're not going to get crystals or anything like crystals or at least I don't see any way of doing that all right so that kind of prevents a naive translation of bands um into solids into SLT okay but uh so nevertheless so being able to do physics is is about in some large part having many different ways of kind of viewing your theory right is the crystal the fundamental thing or is the density of states the fundamental thing well you get to change your mind about that from the crystal point of view what we were just discussing seems crucial to the description of the electrical property of solids but if you can derive what you need from the density of States then okay one way of getting a 
density of states with those particular degeneracies is to have some kind of symmetry which reflects itself in the wave functions and therefore in the bands and therefore in the dispersion relation and therefore in the density of States but if you give me some other way of getting those degeneracies in the density of States maybe I'm just as happy to say the same kinds of things right so one way in which this could be the case in SLT is in models with sufficiently um many phases and we've discussed phases in this
seminar; I won't go into it now. I mean phases of the Bayesian posterior. So I'm not saying phases and bands are the same thing. Especially in solid state physics, they're related but they're not the same thing; that's not a dictionary I'm setting up here. But if you have many phases of the Bayesian posterior which create dense enough intervals of free energies, some of the same intuitions around bands may apply. This has got a big question mark on it. Crazy. Okay, let's see. Maybe I'll pause here for questions. I'm going to move on to talking about spectroscopy and carbon nanowires now, but regarding this sort of quasi-dictionary between solid state physics and SLT, any questions? All right. Okay, let's talk about spectroscopy. This is really cool. As a physics undergrad I never knew whether to be appalled or delighted at how easy the math was. I thought the deep secrets of the universe were going to be really hardcore math, and maybe they are, later you find that out, but it's amazing what you can do with very simple mathematics. So let me tell you about scanning tunneling microscopes. Maybe you don't care about microscopes. Too bad. No, I mean, you need to sort of understand the formula. Somebody committed suicide at that comment; sorry to upset you. So the formula which I'm going to give you for the differential conductance, which is how you think about what a scanning tunneling microscope is doing: I want you to see that formula, understand the role of the density of states in it, and just get why it's actually completely reasonable to think about an analogous device for learning machines. That's the kind of bridge that I want you to be able to walk across between these two fields. Okay, so in a scanning tunneling microscope, an instrument with a sharp metal tip, which is about one atom wide, is brought to within a nanometer of
the conducting sample. So here's the sample and here's the distance z. A voltage bias is applied across the sample, and there's a tunneling current that flows from the sample to the wire and into the microscope, and we measure that current. All right, so that's V, that's I, and the current is proportional to the tunneling probability. Okay, now the tunneling probability itself is proportional to the exponential of some stuff we don't care about. Maybe I'll say briefly what the terms in
the bracket are, but all we care about is this. So phi is the effective barrier height that the electron has to tunnel across. All right, so we have this exponential dependence on the gap z. Now the result of that exponential sensitivity is that very small changes in that distance z translate to large changes in the tunneling probability, and hence the observed current. For example, a 0.1 nanometer change in z can lead to an order of magnitude change in the tunneling probability. All right, so, as you probably know, this is how you see atomic-scale objects. The STM tunneling current I as a function of bias V gives spatial and spectroscopic information about the quantum states: spatial, I guess, is obvious, because you can move the tip around and observe how it changes; spectroscopic for reasons I'll explain in a moment. The relevant formula here is the following. What's called the differential conductance, that's dI/dV, is proportional to the tunneling probability (I'll explain what this j is in a second), the amplitude, and the density of states. So this will take some unpacking. I'm giving you this jargon because if you look at some of these papers to do with twisted bilayer graphene, or any other solid state literature, you'll see this term. Most of the experimental plots you'll find on the front page of a paper about, say, graphene will be a plot of the differential conductance against the bias; that's one of the standard things you'll see. So conductance means the reciprocal of resistance, and you'll remember that V = IR, so V on I is R, and the reciprocal of R is I on V; that's the conductance. Differential conductance is dI on dV. All right, so on the right-hand side here, this is the wave function. We're expanding in some basis of eigenstates: we have some wave function psi_j with some quantum numbers j, and that state has energy epsilon_j. This is to do with
the bias voltage V and the charge of the electron e. Energy, that's all right. Charge... yeah, maybe I'll underline it here. So this is the tunneling electron energy. What this formula says is that the thing you can measure, which is the differential conductance, is proportional to this tunneling probability and these other terms. You see the density of states here; this is what this term is, the density of states. Okay, so it says that if the tunneling electron energy is equal to one of these epsilon_j's,
just think of the density of states as being, just suppose it's discrete, so it's just one or zero (this is the delta function) depending on whether the tunneling electron energy is equal to epsilon_j or not. If it is, you get a contribution to the differential conductance from the amplitude of the wave function at that point. So r_t is the position of the tip; this is r_t. All right, so looking at that, ignoring the density of states for a moment, you can tell that this allows you to see the wave function, or at least the amplitude. And you can even look at individual states, because you just tune the tunneling electron energy, by tuning the bias voltage V, to be at some particular epsilon_j, and then only that state will contribute. Then you get to see the amplitude of that wave function. And indeed you really can see it. So bear with me for a second and I'll try and get a picture up here. Yep, there it is. I hope you can see that; it may not have gone through moderation. Cool. So that's what's called a quantum corral. I forget, I think they're iron atoms in the corral; I believe that's right. I don't have Kittel in front of me. What you're seeing is a picture of the differential conductance, I guess scaled somehow, and what you see in the middle is one of the quantum states for the... I think maybe there's a copper atom in the center or something. But you're seeing the amplitude: those rings there are a picture of one of those quantum states. And if you tuned the bias voltage differently you would see a different set of structures in there. So this is pretty amazing, and that comes from this formula. All right, so that's how a scanning tunneling microscope works. I said ignore the density of states; now stop ignoring the density of states. You can see there that if the density of states were a
continuous thing, then if you were tuning V (maybe I'll go back), if you were tuning V and therefore changing the tunneling electron energy, and you tuned it past the value of the energy for which there was a divergence in the density of states, the right-hand side would blow up as V approaches that particular critical value, and therefore so would the differential conductance. So you would predict, based on this formula, that you should be able to see divergences in the density of states in the plots of the differential
conductance. And yeah, let me show you a plot. Okay, so there on the right-hand side (maybe you want to make that full screen, actually; it's a bit blurred on the two-board view), and I'll talk about the left-hand plot in a moment, you see on the y-axis the differential conductance. It's scaled by I on V; just ignore that. The scaling actually varies by paper; it's not always I on V, if I understand it correctly. And on the x-axis you see the voltage bias. So what you're doing there is you're probing different level sets of the dispersion relation, of the energy function. That's what happens when you vary V: you're looking at different level sets. And you can see these divergences. They're not exactly divergences; they don't go to infinity, because that's not how nature works, but our puny mathematical models make this into an infinity. Okay, and, you know, you've got some macroscopic quantity of the material, you put a voltage across it, you hook up your machine, and there you see these beautiful divergences. What these come from are divergences in the density of states, and from singularities in the dispersion relation. I didn't tell you what this graph is for: this graph is the differential conductance for a sample of carbon nanotubes. So the next thing I want to do now is go through the derivation of the density of states for this example (it's very simple) and then sort of explain this picture. You can see on the left-hand side there's a plot of energy versus density, so this is a rotated graph of the density of states. You see the conduction band at the top, the valence band at the bottom; we talked about that last time. You see absorption and fluorescence: this is absorption and emission of photons. This is not electrical properties, this is optical properties. So I'll explain how that also depends on the density of
states. So on the right-hand side is electrical, on the left-hand side optical, and hence the term spectroscopy. Okay, any questions about these pictures before I do some math? Okay, so let's do a little bit of math. So, example: a nanoscale wire, for example carbon nanotubes. There's nothing specific about carbon here. There's kind of a little exercise in first-year quantum mechanics here, which I'll skip. So the energies and eigenstates are... let me see, I have another picture of this; maybe I'll just draw one.
So it's not one-dimensional, but it's somehow discrete in all but one direction, in a sense I'll explain in a moment. That's the z direction; this is x and y. I think that's right-handed, yeah. Okay, so we're going to assume that the only interesting stuff happens in the z direction. So the energies are going to be epsilon_ij; I'll explain that in a second. Here i and j are quantum numbers in the x and y directions. If you wanted to solve this problem, you'd set up the x and y directions as a particle in a box, some infinite potential well. You would split the wave function into a product, if you like a tensor product of Hilbert spaces, one for the z direction and one for the x, y directions. You solve the problem in the x, y directions, so you just get these quantized things, like a particle in a box, labeled by these two quantum numbers i, j. So psi_ij is some understood function, and we're not going to care about it, times, in the z direction, where you actually have plane wave solutions, e to the ikz. k is the z momentum. Okay, I guess that explains this. What about this here? Well, you have some energy from the x, y component of the wave function, and that's just the energy of this psi_ij, and then you have the energy to do with the z momentum. So the dispersion relation I explained earlier is really a family of bands; Kittel would say 1D sub-bands. You have many different wave functions with the same momentum k. With a fixed momentum k, you have as many different wave functions with that momentum as you do quantum numbers i, j, and they each determine a different dispersion relation (I've written it for you) and therefore different bands. Okay, so the density of states: well, if I tell you I've got an increase of energy by epsilon, how many new states do you get? Well, you get however many new states you get in each one of these separate kinds of wave function, and then you add them all together. And
what about this particular one? Well, if we fix the quantum numbers i, j, then we fix the energy contribution from those directions, and then we're just talking about a free electron gas in one dimension, and you know the formula for the density of states for a free electron gas. If you remember, the number of states in a free electron gas in dimension d was proportional to E to the d on two; it's just the volume of a sphere of radius root E. So therefore the density of states
is the derivative, so it's E to the d on two minus one. If d is one, for a one-dimensional free electron gas, I get E to the minus a half. So D_ij(epsilon) is, well, there are some constants of proportionality; I don't think I care, I never write them. So it's proportional to (stuff we don't care about) times (epsilon minus epsilon_ij) to the minus one half, for epsilon greater than epsilon_ij, and it's zero for epsilon less than epsilon_ij. It's got a divergence when epsilon is equal to epsilon_ij; I just won't put it in there. Okay, so you see that the density of states is the sum of these functions, which have divergences at these values of the energy that have to do with some bands becoming accessible. So the derivative is infinite: the moment that band becomes accessible there's an instantaneous infinite change in the number of states. So you might think this is kind of a cheating way to get a singularity, and I kind of agree, but this is a real thing. Mathematically it would be more interesting if there was a sort of genuine singularity that didn't involve contributions from these partially defined functions or whatever, but that does also really happen. That's what's happening in twisted bilayer graphene, for instance, but it's harder to explain than this example. And anyway, this illustrates the essential point: we get divergences when some bands become newly available, and these are called Van Hove singularities, or VHS. I think that's the simplest example of a Van Hove singularity I can think of. Anyway, these singularities affect the electrical and optical properties of 1D systems, as I've already said. Let's go back to the picture now and look at this carbon nanotube example. You can see on the left-hand side these kind of horizontal lines; these are the epsilon_ij's. And, well, it's a little bit of a lie, because it's not like you're actually considering a single carbon nanotube in a
sample. If you're running a voltage across it, you've actually got many carbon nanotubes in your sample, and as far as I understand it, it's the different radii and chiralities of those carbon nanotubes together which contribute to make these bands that you can see here. But okay, the story is correct: the origin of this structure is to do with these Van Hove singularities. Okay, I promised to explain these absorption and fluorescence arrows; the rest should be clear. So, let's see.
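As a rough numerical illustration of the derivation above (a sketch added for readers, not something shown in the seminar): the wire's density of states is a sum over sub-bands of (epsilon minus epsilon_ij) to the minus one half, switched on only above each sub-band edge. The function names `subband_edges` and `wire_dos` are hypothetical, the edges are taken to be particle-in-a-box energies e0 * (i^2 + j^2) for an assumed square cross-section, units are arbitrary, and all constants of proportionality are dropped, as in the talk.

```python
import numpy as np

def subband_edges(n_max=3, e0=1.0):
    """Assumed sub-band edges epsilon_ij = e0 * (i^2 + j^2), i, j = 1..n_max.

    This is the particle-in-a-box form for a square cross-section,
    in arbitrary units; it is an illustrative choice, not data.
    """
    return np.array([e0 * (i * i + j * j)
                     for i in range(1, n_max + 1)
                     for j in range(1, n_max + 1)], dtype=float)

def wire_dos(eps, edges):
    """Sum over sub-bands of (eps - edge)^(-1/2) where eps > edge, else 0.

    Constants of proportionality are omitted.
    """
    eps = np.asarray(eps, dtype=float)[..., None]  # broadcast against edges
    gap = eps - edges
    contrib = np.zeros_like(gap)
    mask = gap > 0                                  # only accessible sub-bands
    contrib[mask] = gap[mask] ** -0.5
    return contrib.sum(axis=-1)

edges = subband_edges()
for eps in (1.5, 2.0001, 2.01, 3.0):
    print(eps, wire_dos(eps, edges))
```

Below the lowest edge the result is zero, and it grows without bound as the energy approaches an edge from above, which is the Van Hove divergence described in the talk.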
So this line here, this value over here, is epsilon_C2, C2 for the second conduction band, and then this over here is the graph of (epsilon minus epsilon_C2) to the minus one half. Okay, so we've been talking about electrical properties, but I want to really drive home this point about spectroscopy by talking about the absorption and emission of photons briefly. I think it's just so cool that you can see these singularities. The optical emission and absorption of semiconducting nanotubes is dominated by these singularities. I'll write the formula, but while I'm writing it I'll give you the high-level version. Maybe I'll even just let you look at the picture in the webcam while I'm writing. So if you look at that left-hand picture there, we've got these divergences in the density of states; that's the organizing structure for that graph. Now, the density of states diverging means you get a lot of new states per unit volume. That means that if you're thinking about transitions between energy states, well, unless you have some reason to believe otherwise, most of the transitions will be between states that are in those places where it diverges, because that just dominates the set of states. So to a first approximation, what you care about are transitions that involve those divergent energies, or rather electrons that are at those divergent energies. Now that's the loose version of this formula, which is called Fermi's golden rule. This is just the polarization of the photon, and this is the momentum of the electron, and so on. I don't want to explain what's going on here, really, in this term in brackets. There's a general formula where the thing that goes inside here is the Hamiltonian and this is a ground state, and so it's some amplitude term there. But what I really want you to pay attention to is the density of states term. So
what is this saying? It's saying that, well, it's a delta function: it's nonzero when epsilon_j, the final energy, is equal to the initial energy plus the energy of the incident photon, which depends on its frequency omega. Okay, so at places where there are lots of states, where the density of states is divergent, you'll get an enhanced response in terms of this. This is the transition rate, I should have said, sorry. So Fermi's golden rule tells you what the transition rate is between two states, well, in many systems, but in particular
in a semiconductor like this. So where the density of states is divergent you'll get an enhanced response: there will be many more transitions between two states epsilon_i and epsilon_j if both are places where the density of states diverges and h omega is the gap between them. If we go back to this picture here, this would be an example of an h omega. This you get to choose; that's the frequency of the light. So if you pick the light at the right frequency... some frequencies the material just really won't do much with, it won't absorb them. But if you pick a frequency that is a gap between two of these divergences of the density of states, then the system will eat it up: it'll absorb it and then re-emit it, and the emission will also be dominated by transitions between these energy levels. All right, so the upshot there is that you can actually look at this material, and the color of the material is to do with the singular structure, well, the divergent structure, of the density of states, and hence the singular structure of the dispersion relation. You could hardly get more self-evident than the color of the thing. Obviously it's probably not in the visual spectrum, but whatever. Okay, so that's it for the physics part. I want to make some more speculative comments on what this means for SLT and research directions, but that's sort of different, so if you have any questions about the physics story here, now is the time to ask. Yeah. We need to find some new boards; follow me, please. Okay, so as I said, the analogy between solid state physics and SLT isn't perfect, but it's good enough to be useful. I'll just read out some quotes from some of the literature around Van Hove singularities and twisted bilayer graphene to give you a bit of a flavor for the statements you can now understand. The references are in the
written notes. The first one, I think this is one of the first papers due to the original group on twisted bilayer graphene, says: "If the Fermi level lies in the vicinity of a Van Hove point, the singular DOS determines the physical behavior due to the large number of available low-energy states." Hopefully that sounds familiar given what we just discussed. Second quote: this is from a very interesting paper which is applying renormalization group methods to understand some details of what's happening near the magic angle in
TBG. Quote: "Due to the power-law divergence of the density of states, the high-order saddle points are more dominant than the other parts of the Fermi surface at low energy. We thus construct the low-energy theory by approximating the Fermi surface with six patches in the vicinity of these points." That's the starting point of the RG process. "High-order saddle points" is such a gross term. They're not saddle points. It's like this reluctance to let go of the classification into minima, maxima and saddle points. They're not saddle points; they're singularities that are more complicated than saddle points. But when you're reading this literature, "high-order saddle point" or "higher-order VHS", what they mean is just a singularity that's more complicated than a saddle. Okay, so let me draw a kind of two pyramids. For those of you who don't have some overlap with physics, I just want to explain the relationship between... there are, roughly speaking, three parts of physics. You could say there's experimental physics, you could say there's theoretical physics... obviously there's no hard and fast boundary between these things, but physicists will often self-identify into one of these categories. I'm talking about solid state physics in particular. Okay, so what goes in experimental physics? Well, it's stuff like carbon nanotubes and measuring the differential conductance; it's twisted bilayer graphene and measuring whether that's a superconductor; and all the rest of solid state physics that's experimental. Abstracted out of those experiments is theoretical physics, and that's what we've been doing, basically, or discussing. That's what you'll find in Kittel in the parts that aren't presenting numbers from experiments: it's theoretical physics, and that's organized around, in this case, things like the density of states, perturbations from the assumption of being a free electron gas, RG methods. And what I wanted to illustrate over the last two seminars was
part of how that works. It's different from proving things: it's a set of good approximations, and domains in which the approximations are reasonably valid, matched to predictions in specific experiments where those approximations and the formulas that flow from them are borne out. It's a body of knowledge of that form. And that's all buttressed by mathematical physics, which provides the concepts that you would use to explain the theoretical physics: things like groups and topology and geometry and so on.
Right, so the beginning of Kittel's book is a discussion of crystal lattices. It doesn't really talk about group representations, but you could take a second treatment of solid state physics that would do that. There's the topology of Fermi surfaces; again, that wouldn't be discussed in Kittel, but if you went further you would talk about that. And then in the modern literature you'll find more about geometry, as we've discussed. There are papers about the classification of ADE singularities and catastrophes written for experimental physicists in this literature, because of this kind of interaction between these layers. So with the mathematical physics, it's not always that proving things is the essential contribution; it's organizing the, you could say, hacks that are in the theoretical layer and suggesting new ones. The proofs take you out further into the territory where you can make new approximations and then look for new physics, and that's how it all works together. Okay, so that's a very abbreviated take on how it works. And I want to indicate devices in this hierarchy. We talked about the scanning tunneling microscope. So what is the role of a device like that in solid state physics? Well, the devices co-evolve with the physics. The conceptual architecture of solid state physics has these density of states plots in it, this differential conductance plot, and then you look at the peaks and you say something about the band structure. Maybe that's the only time you ever have direct evidence of the band structure in your material, looking at that kind of plot. It's not like somebody hands you the band structure; you have to infer it, and you infer it from the output of these devices. So the devices are designed on the basis of the theoretical physics and interpreted on that basis, and they're how you see and organize the experiments. All
right, so forgive me for giving a history of physics lesson, but now I want to point out that we kind of lack maybe two layers of this pyramid in SLT. Okay, so we have experiment, we have theory, and we have math, and part of the point of the previous board is to distinguish these two things a bit. A theoretical physicist is not the same thing as a mathematical physicist, not always anyway. They could be the same; that could be one person doing both at different times, and usually that
is the case, but the aim in theoretical physics isn't necessarily to prove things, as in a rigorous proof, although sometimes the arguments shade into rigorous proofs. And similarly we need something like this in SLT, at least if we're to apply it to deep learning or, say, to AI alignment. If we want to do things with large-scale systems about which we'll never prove anything (we're not going to prove that the RLCT of a large Transformer model is X), what are we going to do then? Well, we do theoretical physics, or some analog of that. So we have a bunch of principles informed by the mathematical foundations, we know regimes in which those approximations are valid, and we know systems of how to make deductions on the basis of those approximations. I mean, that's what solid state physics is; that's what Kittel's book is. All right, so let me fill in a little bit here. Of course you know about this; we talk about it all the time: we have statistics, we have geometry, et cetera. The kind of thing that would go in here is part of the theory of phases and phase transitions. Some of that will be rigorous, as much as possible, but it won't always be appropriate for everything we do; it depends what we're aiming at. If we're applying SLT, it won't always be that the goal is to prove a thing, just as it isn't in theoretical physics. So if you're not proving things, then how do you know if it's right or not? Well, that's where the experiments come in. It's my opinion that we need a body of toy models, like the carbon nanotubes in solid state physics: they exhibit some interesting physics, but everybody can reproduce it in their lab; it's a simple system. So we need toy models that are standard and replicable, in which the theoretical approximations that we're developing do work: they do describe some of the observed physics, or learning behavior, or whatever you want to call it. And the bridge between those two is, well,
something we have to figure out. You might call it a sort of analog of a scanning tunneling microscope: something that is sensitive to the density of states and can reveal the structure of the density of states in a way that reflects other things that we're measuring in the experiments. How exactly to do that, I don't know; that's research. But there are many devices like that in solid state physics, and not all of them will have reasonable analogs, but they might inspire some ideas.
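To make "sensitive to the density of states" slightly more concrete, here is a very crude toy sketch, added for readers and not proposed in the seminar. It assumes the standard SLT reading in which the density of states of an energy K(w) is the derivative in t of the prior volume V(t) of the sublevel set {w : K(w) < t}, and it estimates how V(t) scales with t by Monte Carlo for two hypothetical two-parameter energies: a regular one, w1^2 + w2^2, and a singular one, (w1 * w2)^2. The names `sublevel_volume` and `scaling_exponent`, the uniform prior on [-1, 1]^2, and the fitting window are all illustrative assumptions.

```python
import numpy as np

def sublevel_volume(K, ts, n_samples=400_000, seed=0):
    """Monte Carlo estimate of V(t) = prior mass of {w : K(w) < t}.

    Assumes a uniform prior on the square [-1, 1]^2.
    """
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    k = K(w)
    return np.array([(k < t).mean() for t in ts])

def scaling_exponent(K, ts):
    """Least-squares slope of log V(t) against log t over the window ts."""
    v = sublevel_volume(K, ts)
    slope, _intercept = np.polyfit(np.log(ts), np.log(v), 1)
    return slope

def regular(w):
    return w[:, 0] ** 2 + w[:, 1] ** 2      # nondegenerate minimum at 0

def singular(w):
    return (w[:, 0] * w[:, 1]) ** 2         # zero along both coordinate axes

ts = np.logspace(-2, -1, 8)
print("regular :", scaling_exponent(regular, ts))
print("singular:", scaling_exponent(singular, ts))
```

In this toy setting the regular energy should give a slope close to 1, while the singular one gives a noticeably smaller slope (below one half in this window, because of a logarithmic correction), so the density of states dV/dt grows much faster as t goes to zero in the singular case. It is only a volume count, not a device in the sense the talk is asking for.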
All right, and another thing we should borrow from solid state physics is the RG techniques. It's just absolutely clear that with many of the arguments based on renormalization group flow techniques, for example in this paper I'm citing in the notes and that I quoted from earlier, one can just do this in SLT. You may not be able to prove that it's valid, but we should be able to find toy models in which we can do RG-based arguments and have those reveal the things we see in the experiments. So that's the kind of thing we need, in addition to proving lots of beautiful theorems, which are still very important. But we need these other two layers, because the rigorous part is hard and takes a lot of time and won't get us to where we need to get on its own. Okay, so I'm going to stop there. Thanks, everybody. Questions, suggestions? Build the device; tell me how to build it. It's a really useful framing. Yeah, thanks. I would say that part of what Edmund is doing right now in his research is kind of inching towards building devices like this. I don't know if you would disagree with that framing, Edmund, but I think you would recognize what you're doing in this picture. So I was about to say about that... no, not about saying that, but I was just like, oh yeah, that's what I'm doing, when you say that. Yeah, it's tempting to think... I mean, the obvious analog of conductance would be to look at the movement of trajectories under, let's say, the gradient of K with some noise near w0. I mean, how fast do they move? It's not crazy to compare that to the current. But yeah, things like that. So if we simulate trajectories near singular level sets... as I said earlier, they're not electrons, it's not directly analogous, but all we need is a formula like that, with a right-hand side that has the density of states in it, and then we're off to the races. So I think it's, yeah, I think
it's doable. You know, this is the first time I formulated it this way in my head; it was, like, this morning. I haven't thought about it, but I think it's doable. Is there any... so in solid state physics, the fact that colors arise from just there being so many more available states at those singular blow-ups (sorry, not blow-ups, that means something else), you know, when the density of states diverges there are so many more states available that it's overwhelmingly
likely that you're going to transition from one to another, and therefore it's those colors that you see in the spectrum. Is there anything written in the literature, or even, I don't know, computed by Edmund, that is analogous to that, where you see in learning dynamics a jump? I guess that's a phase transition. Yeah, sorry, I jumped on the bike... to where, from one? Okay, in the picture with the left and the right, the spectrum picture that Dan showed, a photon is emitted when an electron transitions from one state to another, and it jumps from one spike on the left-hand side of that board to another spike on the left-hand side. And it jumps between those two spikes because there are so many more available ways to be for electrons in the spikes, and those spikes correspond to level sets of the KL divergence. Yeah, so the plots we're showing sometimes from section 7.6 of Watanabe's book arguably have a similar explanation, where if you kind of coarse-grain, and you just line up the energy, the KL divergence, from zero at the bottom to infinity at the top, and then you look at the density of states and look where it diverges, well, those divergences would probably be singular level sets of K. Some of those, not all of them, will qualify as phases in the language we've adopted, where it would be somewhere like a local minimum of the KL divergence, blah blah blah, satisfying some conditions, somewhere where the probability mass of the posterior might concentrate, and there will be a divergence of the density of states there. And it's clear that somehow that's important, transitioning between two of those kinds of densities of states. But we lack photons. To put it differently, the Hamiltonian doesn't couple to any degrees of freedom that we can just send in and read out. The only degrees of freedom we couple it to, I mean when we
when we do an SGD run or we put in Markov chain Monte Carlo, especially in Markov chain Monte Carlo actually, because we introduce these noise terms or whatever, we're coupling the loss function, the Hamiltonian, the KL divergence, to other degrees of freedom, the position degrees of freedom of the particle for example. And then we can read out something from that. But we don't have any interactions quite like the... I mean, the effective Hamiltonian for the solid state physical system has
like the fermionic terms in it that involve creating and destroying electron states at different points in the lattice, but it also has terms in it that come from QED, I guess, from interactions between the electrons and the photons. Those terms allow you to pull little bits of energy out or put them in and see what's happening, and we don't have an obvious analog of that, at least not that I see.

I think we can imagine having a trillion particles, which is in my mind analogous to SGD trajectories, but actually N of them running on the loss surface. Then if you keep track of what proportion of them stays in which region for how long, and things like that, that could correspond to the density of states. And for the transition, we will have to have some analog of injecting energy into the system, so that of the trillion electrons, some of them have some transition probability of going from one to the other. Yeah, I'm just repeating what Dan said, that there are a lot of things that we don't have in our analog, in particular particles occupying states and ways to inject energy.

Yeah, I guess you could imagine, I mean, the dumbest thing you could do would just be to have trajectories and then give it a big hit with some noise and see if it ends up somewhere or other.

Right, but that's more analogous to raising temperature than injecting a photon.

Yeah, but if you just give it one crazy big step and then it returns to what it was doing, you know what I mean? So it's not like... yeah, I don't know. I mean, that's essentially what happens when there's an incident photon, right? Or you could think about it as... well, maybe phonons are actually a better analog. So, phonons. How can I explain phonons briefly? The crystal lattice doesn't stay exactly in its lattice positions. There are of course intermolecular forces between the atoms, the ions in the lattice, that has a ground
state, which is where they're all where they're supposed to be, so to speak, but there are perturbations away from that ground state. There's a kind of nice mathematical story there: those perturbations can be thought of as what are called Goldstone bosons. They're due to the symmetry of the situation. In lattices they're called phonons. You can think of that as like the... I guess depending on your level of physics you think of particles as real things. That's not always... like, sometimes in the mathematics you can
find a thing that behaves like a particle but it's not, you know, something you can spit out of the LHC from a collision or something. So phonons are like that. They're not like electrons, but they're ways of pumping energy, modes you can pump energy into in a crystal lattice, and they travel around on the lattice. You can think about it like it's the atomic, the most fundamental unit of vibrating an element of the crystal away from its lattice position. Anyway, those travel through the lattice, they interact with the electrons, and that's a big part of solid state physics.

So I wonder if there's even a sense in which... yeah, I don't know. I mean, there's lots of noise in SLT, right, for example in the sampling, and you might think maybe there's some way of thinking of some of those perturbations in the system as being analogous to phonons. So suppose we trained something for a while with one sample, K_n, and then we resample and we see what happens, right?

I was going to say, we resample, but it's sampling from a true distribution that is... well, if it is injecting energy, then we want to be sampling from a true distribution which is closer to the minimum. I'd need to work it out. But so you move the true distribution closer or further away depending on whether or not you want to lower or...

But that's more like changing the ground state.

Yeah, I think so.

Yeah, we have to keep a careful balance. It's tempting to do the non-rigorous stuff because proofs are hard, so we need to do both, and when we're doing the non-rigorous stuff we need to also be very systematic about it. So I think we should try all the things, because, I mean, that's how experimental physics works, right? You try an insane number of variations of the experiment and the device, and then one of them works. Actually, another thing I wanted to say and didn't write down was
I mentioned toy models. I think we should pay very close attention to what the mechanistic interpretability people are doing in the alignment community. For example, at Anthropic and also at OpenAI they have these very interesting papers with small ReLU networks or small transformer networks. I think we should prioritize... I mean, maybe we haven't thought so much about them because it's hard for us to prove things about them. That's fine. But with regards to this second layer