
The canonical probabilistic model for temporal or sequential data is what's called a Markov model. Markov models are named after Andrey Markov, a Russian mathematician who did some work in stochastic processes in the late 1800s and early 1900s. And if I'm not mistaken, this photograph was taken right after Markov won the prize for the world's pointiest beard.

So what are Markov models? The idea behind a Markov model exploits a deep, intrinsic fact about the real world: the future is independent of the past given the present. In other words, if you know the exact state of the world right now and you want to use that knowledge to predict the future, then knowing something about the past, about the state of the world one second ago or one day ago, is not going to help you predict the future, because everything you could want to know is already encoded in the current state of the world. This is true at least in the classical Newtonian view of the world; I don't really know enough about quantum theory to know whether it's true in that setting. But it's hard to imagine how this couldn't be true once you get your head wrapped around what it means. And because, at least on large scales, this is a fundamental fact about the world, Markov models can be used to model an extraordinarily large number of applications.

Whenever we're using Markov models, we're thinking about temporal data or some sort of sequence data. For example, some things for which Markov models are used would be weather, or economic data, like in finance. Markov models are often used for language, even music, just a whole litany of things. I could go on and on with the list of things that Markov models are used for.

So let me give you a couple of real examples of applications of Markov models, things for which they're used. I use speech recognition software called Dragon NaturallySpeaking, and let me demonstrate a little bit for you here: "Speech recognition software programs use Markov models to listen to the sound of your voice and convert it into text. Period." So you can see, well, I guess it got most of it right, but it messed up here, on the "and convert it into text" part. So it's not perfect, but these programs work quite well, and it would probably do much better except I'm using a new microphone right now and I haven't trained it on the new microphone. So that's one application. Dragon NaturallySpeaking (I don't know all the details of their implementation) and, I think, Windows Speech Recognition use Markov models to model what you're saying.

So that's one sort of application of Markov models, and let me give you another one here. Markov models are also used to generate music, musical compositions. I downloaded a little clip of a composition that was generated. I'm not sure if it was generated using a Markov model, but they do use Markov models for this type of thing, and this was an interesting example of automatically generated music. So here goes. Okay, so that gets a little better, but I'm going to stop there. Hopefully you could hear that; hopefully it wasn't too quiet.

So those are a couple of applications. One other really entertaining one is for language: if you want an example of a Markov model that was applied to generate language, look up Mark V. Shaney on Wikipedia. There's an entertaining entry there.

Let me now give you some more concrete examples of data that we may be interested in modeling using Markov models. We'll build up to the definition of a Markov model by exploring what we might want out of a model of temporal data.

First, let's suppose we had some data. I'll draw some axes here, and we'll say we've got some data that looks, maybe, like this. I'm thinking about a climate sort of example, so maybe this is years on this axis, and we get some data points. They have some randomness, but there's also some sort of temporal pattern here, some periodicity that is correlated with the years, some sort of seasonal change. For example, maybe this would be something like CO2 levels in the atmosphere. So there's a periodicity, and there's also a drift here, I guess, in what I drew. This is the kind of temporal or time-sequence data that you might want to use a Markov model, or a generalization of a Markov model, for.

Another example: say you're trying to design a robot that uses GPS coordinates to figure out where it is. The robot's going to be driving around, and you've got latitudes and longitudes. You get a sequence of GPS points and you want to keep track of where your robot is. So your software program is going to get one point here, then maybe another point, a latitude and longitude, then a third point, and so on over time. So this is time one, time two, time three, time four, time five. And these don't all represent the exact position of the robot; they're noisy measurements of the position. Maybe this is time t or something, and what you want to do is figure out the actual position at time t. This is the kind of thing that you would use some sort of Markov model for.

Let me give you one more example, something completely different. I mentioned language before. "What is the word at the end of this ___?" I'm sure you could fill in the blank here. Markov models are often used for language, to model text and natural language like this. So filling in the blank in a sentence like this is something that you might use a Markov model for.

All right, those were just a few little examples. Now let's start to think a little more formally. With these examples in mind, say we get some data; let me just write x_1 to x_n. These could be GPS positions or CO2 levels or what have you. We're going to model this data, and this is sequential data now. We want to have some probabilistic model, so we may as well take some random variables X_1 to X_n to model this data. That's reasonable; of course, if we're using probabilities, that's the natural thing to do.

But what probabilistic model should we use? Maybe the simplest thing would be to take them to be i.i.d. But is that really going to capture the dependencies that we're interested in? For example, with the CO2 levels there's clearly some dependency between points which are nearby: if you're nearby in time, then your CO2 level is going to be closer also, or at least on average it'll tend to be closer. So the i.i.d. assumption is not really cutting it, because it's saying that there are no dependencies between them. So i.i.d. is not going to work.

So what dependencies matter? What dependencies do we really want to capture? If we just allow everything to be dependent on everything else, then we get a totally intractable model. So we'd like to have some sort of independence to make things more tractable. Which dependencies matter?

I knew a guy who was a weatherman (well, he used to be a weatherman), and he always used to say that for the most accurate forecast of tomorrow's weather, look out your window. In other words, the most accurate prediction of what's going to happen in the near future is what's happening right now. That's sort of like what we have here with the CO2 levels: there are these dependencies. It's sort of like this here with the GPS, too, and the same thing here in a sentence. It might not always be the same, but at least these suggest that, for predicting X_{n+1}, we're going to get the most information by looking at X_n, the most recent point, or maybe X_{n-1} or X_{n-2} too. By looking at more recent data, we're going to get much more information than by looking at data from the distant past.

So the recent past tells you more than the distant past. This suggests that we could do the following. If we were designing some generative model, we could choose X_1, and then choose X_2 to depend on X_1, and in general we could choose X_t, X at time t, to depend on the most recent points X_{t-1}, X_{t-2}, and so on, back to, say, X_{t-m}, for some fixed m. We can write down a generative model which satisfies this, and the simplest case would be m = 1.

This is going to lead us to the definition of a Markov chain. I'm about to run out of time in this video, and I want to make sure to give full treatment to the definition, so let me stop there. We'll come back and continue talking about Markov models and Markov chains.
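
To make the generative idea above a bit more concrete, here is a minimal Python sketch. It assumes one particular way for X_t to depend on the previous m points: a weighted sum of them plus Gaussian noise. That linear-Gaussian form, the function names `sample_sequence` and `lag1_autocorrelation`, and the coefficient 0.9 are all illustrative assumptions, not something given in the lecture. With an empty coefficient list the points are i.i.d.; with one coefficient, each point depends only on the one before it (the m = 1 case).

```python
import random


def sample_sequence(n, coeffs, noise_sd=1.0, seed=0):
    """Draw x_1..x_n where x_t depends on the previous m = len(coeffs) points.

    Hypothetical linear-Gaussian choice (an assumption, not from the lecture):
        x_t = coeffs[0]*x_{t-1} + ... + coeffs[m-1]*x_{t-m} + Gaussian noise.
    With coeffs == [] every point is drawn independently (the i.i.d. case).
    """
    rng = random.Random(seed)
    m = len(coeffs)
    xs = []
    for _ in range(n):
        # Most recent points first: x_{t-1}, x_{t-2}, ..., at most m of them.
        # Early points simply have less history to look back on.
        recent = xs[-m:][::-1] if m else []
        mean = sum(a * x for a, x in zip(coeffs, recent))
        xs.append(rng.gauss(mean, noise_sd))
    return xs


def lag1_autocorrelation(xs):
    """Sample correlation between consecutive points x_t and x_{t+1}."""
    mean = sum(xs) / len(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


iid = sample_sequence(5000, coeffs=[])       # m = 0: no dependence at all
chain = sample_sequence(5000, coeffs=[0.9])  # m = 1: depends on x_{t-1} only

print("i.i.d. lag-1 autocorrelation:", round(lag1_autocorrelation(iid), 2))
print("m = 1 lag-1 autocorrelation: ", round(lag1_autocorrelation(chain), 2))
```

Under these assumptions, the i.i.d. sequence should show a lag-1 autocorrelation near zero, while the m = 1 sequence should show a strongly positive one. That is the "nearby points tend to be closer" behavior that the i.i.d. model cannot capture.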