Kirill Eremenko: This is episode number 265 with top instructor in the
space of big data, Frank Kane.
Kirill Eremenko: Welcome to the SuperDataScience podcast. My name
is Kirill Eremenko, Data Science Coach and Lifestyle
Entrepreneur. And each week we bring you inspiring
people and ideas to help you build your successful
career in data science. Thanks for being here today
and now let's make the complex simple.
Kirill Eremenko: This episode of the SuperDataScience podcast is
brought to you by our very own Data Science Insider.
The Data Science Insider is a weekly newsletter for
data scientists, which is designed specifically to help
you find out what have been the latest updates and
what is the most important news in the space of data
science, artificial intelligence and other technologies. It
is completely free and you can sign up at
superdatascience.com/dsi. And the way this works is
that every week there's plenty of updates and
seemingly important information coming out in the
world of technology. But at the same time it is virtually
impossible for a single person on a weekly basis to go
through all of this and find out what is actually really
relevant to a career of a data scientist and what is
actually very important. And that's why our team
curates the top five updates of the week, puts them
into an email and sends it to you.
Kirill Eremenko: So once you sign up for the Data Science Insider, every
single Friday you will receive this email in your inbox.
It doesn't spam your inbox, it just arrives and has the
top five updates with brief descriptions. And that's
what I like the most about it, the descriptions. So you
don't actually even have to read every single article. So
our team has already read these articles for you and
put the summaries into the email so you can simply
just read the updates in the email and be up to speed
in a matter of seconds. And if you like a certain article,
you can click on it and read into it further.
Kirill Eremenko: And so whether you want great ideas that can be used
to boost your next project or you're just curious about
the latest news in technology, the Data Science Insider
is perfect for you. So once again, you can sign up at
www.superdatascience.com/dsi. So make sure not to
miss this opportunity and sign up for the data science
insider today and that way you will join the rest of our
community and start receiving the most important
technology updates relevant to your career already this
week.
Kirill Eremenko: Welcome back to the SuperDataScience podcast, ladies
and gentlemen, super excited to have you back here on
the show today. And the guest for today is somebody
who I've wanted to interview for quite a while now,
Frank Kane. Frank is an expert in the space of big
data. He worked at Amazon for over a decade and you
might actually know him quite well from his courses
on Udemy where he's one of the top instructors in the
space of data science and big data. And today's
conversation was very interesting because we
approached it from two spaces, from the space of data
science and the space of big data. And in this podcast
you'll find out how the two areas have been different
but are now slowly but surely converging into
something that is very intertwined and why it is
important or why it is becoming more and more
important for a data scientist to be well adept in the
space of big data as well.
Kirill Eremenko: Also in this podcast we will talk about Frank's
background, which was very interesting spending over
a decade at Amazon and working on lots of different
systems. There you'll find out very useful tips on
recommender systems such as user-based and item-
based collaborative filtering as well as other types of
recommender systems and where this space of
recommender systems is going. So you can probably
already tell that this podcast is quite heavy on
recommender systems. So if that's your thing, then
this podcast is definitely for you. And you'll also find out
why recommender systems are important across all
spaces, not just in retail, so how many different
industries can use recommender systems.
Kirill Eremenko: We'll also touch on singular value decomposition or
SVD model based methods, deep learning and Amazon
DSSTNE. And finally towards the end of this podcast we
will talk about hiring. So Frank had a huge say at
Amazon on who's hired and who's not hired into the
teams and he's got some really exciting tips to share
with you on this podcast. So can't wait for you to
check out all the great insights from Frank here. And
without further ado, I bring to you Frank Kane, one of the
top experts and instructors in the space of big data.
Kirill Eremenko: Welcome back to the SuperDataScience podcast, ladies
and gentlemen, super excited to have you back here on
the show because I've got a great colleague of mine
and a great online instructor and entrepreneur. On the
phone, Frank Kane calling in from Orlando, Florida.
Frank, how are you doing today?
Frank Kane: Doing great, Kirill. How are you?
Kirill Eremenko: Doing well as well. Such an honor to talk to you again.
We met at Udemy Live, I think it was last year and had
some interesting chats and now we're here on the
podcast. How's things been for you over the past
almost year now?
Frank Kane: Yeah, it's been going great. Things continued to grow
and as I'm sure you know, there seems to be a
boundless demand for online education in the fields of
data science and machine learning and big data. So
we're all kind of riding that wave.
Kirill Eremenko: Yeah, yeah, for sure. And exciting to see new courses
popping up from you. And you mentioned you're
working on some really exciting things right now. What
are the courses that you're working on right now?
Frank Kane: Well, I just released an update to our Elasticsearch
course. So kind of lately I've been focusing on the big
data side of things and Elasticsearch is a really
interesting technology that kind of diverged from its
original purpose. That's kind of the cool thing
about it. So you hear about Elasticsearch and you
think it's just for search engines, right? Like powering
search on Wikipedia or something. But it's sort of
morphed into this tool for doing large scale data
analytics and web log dashboards and things like that.
So that's the latest thing I've been up to. Prior to that I
released a new course on recommender systems,
which I think is something we want to talk about as
well.
Kirill Eremenko: Yeah, very cool. And a big shout out goes to Manning
Publications for helping us arrange this podcast. And
it's really funny, like you mentioned, they reached out
to us to arrange the podcast and to promote your new
work while we already knew each other. So like you
said, it's very serendipitous how these things happen
sometimes.
Frank Kane: Yeah, I love that word. Serendipitous. And I mean,
that's a big part of what we do in recommender
systems too, is what we call serendipitous discovery.
This is like a serendipitous connection, small world
kind of a thing.
Kirill Eremenko: Awesome. Awesome. Okay. So yeah, we've got so much
to talk about. You have such a broad, I mean, such an
interesting career path with your time at Amazon and
how you moved to courses. So to kick us off for, I'm
sure a lot of our listeners, those of you who take my
courses on Udemy or SuperDataScience or those of
you who take Frank's courses, there's a huge overlap
in the sense like there's a lot of people who already
know you, but for somebody who doesn't know you or
doesn't know you well, give us a quick rundown, who
is Frank Kane and what has your career been like?
Where has it taken you?
Frank Kane: Yeah, man, it's a long story. I kind of started off as a
software engineer in the video game development
industry of all things. And from that, I went on to
developing flight simulators and one day I got a call
out of the blue from amazon.com in Seattle. And they
said, "Hey, you know, we're looking for good engineers.
Do you want to do a phone interview?" I'm like, "Sure."
And next thing I knew I was moving to Seattle. Right?
And they hired me into their personalization
department and that's basically what we call
recommender systems today. So this is back in 2003, I
think. So real early days of this field and we didn't
even call it-
Kirill Eremenko: Yeah. Data science didn't even exist back then.
Frank Kane: That's exactly what I was going to say. That wasn't
even a thing. That term wasn't even coined yet, but we
were doing it. It was kind of a-
Kirill Eremenko: Yeah, the minus seventh year of data science.
Frank Kane: Yeah. And we were kind of inventing it as we went.
Right?
Kirill Eremenko: Yeah.
Frank Kane: So it was exciting to be a part of that. And yeah, I
stuck it out at Amazon for 10 years, almost 10 years
anyway.
Kirill Eremenko: Wow.
Frank Kane: And worked my way up from software engineer to a
senior manager. And by the end of my career there, I
was actually running the engineering department of
IMDB.com, which is a subsidiary of Amazon. So that
was fun. It's a big movie website if you're not familiar
with it. But yeah, after 10 years, it was time for
something new. My family was itching to get out of the
rainy environment of Seattle. So we decided to make a
go of it on our own and packed up and moved down
here to Orlando and been working for myself ever
since.
Kirill Eremenko: Yeah. Orlando is great. Right? I was there once and
you guys have Universal Studios parks, theme parks
there, right?
Frank Kane: Yeah. Universal, Disney World, SeaWorld. It's
definitely a fun place to be, especially if you have kids.
Kirill Eremenko: That's awesome. How many kids do you have?
Frank Kane: Two daughters, they're both grown up now, but when
we moved here, they were still young enough to enjoy
it. So it's been fun.
Kirill Eremenko: That's awesome. Okay. And so 10 years in Amazon,
amazing. Really, really cool. And you moved there from
a software engineer to senior manager and then
managing a whole department at IMDB. What was that
like? What was it like working at Amazon?
Frank Kane: It was exciting. I mean, the thing that I love the most
about it was that you're always surrounded by really
smart people and you're never going to have a problem
finding people that are smarter than you to learn from.
Right? So a lot of people say that if you're not learning,
you're in the wrong job, right?
Kirill Eremenko: That's true.
Frank Kane: So you're always learning at Amazon because they're
just so picky about who they hire. And there were just
some amazing people there that you can learn new
techniques, new ways of thinking from, and not just in
engineering too, right? Also from the business side,
just being able to sort of absorb how Jeff Bezos thinks
in itself is hugely valuable as well. Right? So it was
tricky sometimes.
Kirill Eremenko: Got you. Did you ever get to meet him?
Frank Kane: Yeah, yeah, quite a bit. I mean, back then Amazon was
a much smaller company than it is today. So we were
all in the same building and you'd find yourself in
the men's room next to him for all you knew. But yeah,
I had a lot of meetings with him and got to talk to
him quite a bit actually.
Kirill Eremenko: What was he like as a person?
Frank Kane: He's intense, but super smart. Definitely the smartest
guy that I've ever met in my life and that's saying a lot.
But yeah, just his ability to sort of analyze any
situation and just be right about it really quickly is
pretty admirable.
Kirill Eremenko: That's awesome. That's fantastic. All right. And so then
you moved to Orlando and you founded Sundog
Software. Why the name Sundog?
Frank Kane: Oh, that's a long story. So it actually has nothing at all
to do with data mining, sorry data science or machine
learning. After I left Amazon, I had a noncompete
agreement like a lot of people do, so I couldn't really do
anything directly related to what I was doing at
Amazon. So instead I got into the field of visual
simulation, basically making 3D simulations of
clouds and weather and oceans for simulation and
training products. So that's where Sundog Software
came from. A sundog, if you don't know, is actually an
atmospheric effect that is like a rainbow on either side
of the sun, under certain conditions. So since I was
building software that simulates the sky, we kind of
drew our name from that because it was basically the
only thing that wasn't trademarked yet. So that is the
genesis of the Sundog. It wasn't actually named after a
dog.
Kirill Eremenko: Got you. Okay. And so you were providing, creating
this software for simulation and how did that morph
into online education? I'm always curious about these
stories, because so far nothing in your story even
flagged that you are going to be a super successful
online instructor. When did that transition happen?
Frank Kane: Yeah, I didn't see it coming either. So I mean, how did
it go down? Basically, after I quit Amazon and decided
to go on my own, I was kind of freaking out. Right?
Because I left behind these hugely valuable stock
options and stuff and I came down here with enough
money to get by for a while, but I was still pretty
nervous about it. Right?
Kirill Eremenko: Yeah.
Frank Kane: If you've never been self-employed before, it's a very
scary thing to jump into. So I started doing some
freelance work on the side to sort of supplement what I
was getting from selling my own software that I had
written. And one of those freelance gigs was actually
doing curriculum development for a company called
General Assembly in New York City. So they were
looking for someone to put together a data science
curriculum for an in-person training class that they
were putting together. So I did that and somehow,
because I had this Amazon pedigree, they plastered my
face all over their websites saying, "This course was
developed by an Amazon guy." And basically, so what
happened then was someone from Udemy was trying
to recruit new instructors in the field of machine
learning and data science. And they somehow found
me spelunking on the Internet and gave me a call out
of the blue and said, "Hey Frank, we're looking for
instructors on Udemy to teach big data and data
science topics, want to give it a shot?" And I'm like,
"Oh, why not? How hard can it be?" Right?
Kirill Eremenko: Yeah.
Frank Kane: Little did I know. It's actually really hard. But yeah,
that's kind of how my first online course came to be.
They reached out to me and I said, "Well, let's give it a
shot." And the funny thing is, the first course that I
made was really kind of a flop. The first month that we
put it out, it made like 200 bucks or something. I'm
like, "Well, all that crap." Well, we tried. But after
putting in so much effort into a course, I mean, as you
know, it takes many months to put all of these things
together, right?
Kirill Eremenko: Yeah.
Frank Kane: I didn't want to give up on it that soon, right? So I'm
like, "Fine, I'll try making another course and see if I
can sort of like build up on this and not give up quite
yet." And as a result of that, things started to actually
take off. So it was just sort of a hockey stick of growth
after that for a few years where you kind of have this
compound interest effect where you make one good
course and the students from that course are
people that you can sell your next course to and so
on and so forth. And you just keep building upon that
audience. Right? So that's kind of how it all
snowballed.
Kirill Eremenko: Very, very interesting. Yeah, totally, totally can relate
to that story. It's [inaudible 00:14:55] but I guess as
long as you have that inner drive or you get this feeling
of not just accomplishment but fulfillment when
somebody takes your course and feels that they've
learned something and that they can now use these
skills and especially if they tell you about it, if they
say, "Hey, Kirill or Frank, I took your course and I feel
empowered to do something in my job." Or "I actually
already did something with that knowledge and I
finished the project, I got a promotion or I helped a
colleague learn". It really gives you that additional
inspiration to keep moving forward and not to give up.
Would you say that you get that feeling as well?
Frank Kane: Oh yeah. There's so much to keep you motivated,
right? I mean, like you said, just that positive feedback
of how you're actually changing people's lives in a
positive way. I mean, what's not to love about that?
LinkedIn has been great for that. Right? Like I'll see
people posting online, "Hey, I actually got this
certification because of you or I got this job because of
you, or thanks for your career advice on getting
interviewed at Amazon. Thanks to you, I actually got a
job." I'm like, "That's awesome." Everyone wants to
make the world a better place. Right?
Kirill Eremenko: Yeah.
Frank Kane: Yeah, so that's awesome. And also just the scope of
the impact, right? I mean, I had no idea there was so
huge of an audience for this stuff out there in the
world. And if you think about how many football
stadiums you'd have to fill out to put all of our
students in them at one time, it's some crazy number,
right? It's just hard to visualize even.
Kirill Eremenko: That's crazy. Yeah, I'm looking at your Udemy profile.
You have 248,000 students for those out there, it's like
almost a quarter of a million students. That's crazy,
one fit-
Frank Kane: Yeah, that's not new to me.
Kirill Eremenko: Yeah it's just-
Frank Kane: Then there's also Manning and all the other platforms
that we're on too. So it adds up to quite a bit.
Kirill Eremenko: Yeah, for sure. And so what I wanted to touch on here
is that like our area of expertise and area of where we
teach overlaps to some extent, but it's also slightly
different. So you mostly teach in the space of big data,
plus how it overlaps with data science, machine
learning. And that's what I wanted to touch on. With
the passing years since data science came around, big
data appeared. These two have been kind of close and
also the relationship between them has been also
developing over the years. So can you tell us a bit
about that? How has the relationship between big
data, data science and machine learning on the other
hand, how has it developed over the past couple years
and what is it like now?
Frank Kane: Yeah, I mean, kind of my perception of it is that they
start off going in kind of their own directions, right?
And now they're kind of all starting to converge it
seems. I mean, that's kind of my high level take of it.
So originally when we started teaching data science, it
was all about messing around with Jupyter Notebook
on your own individual PC somewhere or individual
Linux host or whatever. And it's messing around with
smaller data sets. And to be fair, you can analyze a lot
of data on one machine if it's a beefy enough machine.
Then we have machine learning, which is off playing
around with the neural networks and stuff these days.
And you can still do quite a bit on a single GPU or a
machine with multiple GPUs.
Frank Kane: And then almost orthogonally, we have this world of
big data where people are using things like Hadoop-
based platforms like Cloudera or Apache Spark and
things like that to distribute the processing of data at
massive scale. And there's been these efforts to kind of
slap one on top of another. Spark has their Spark
MLlib library, for doing machine learning on Spark.
Obviously, tools like Cloudera have tools for doing
large scale data analysis using their platforms. But it's
only recently I think that it's starting to converge.
Right? We have things like the data pipeline on... Sorry,
the deep learning pipeline on Spark coming out where
you can actually do large scale machine learning and
deep learning on Apache Spark. So that's coming
together.
Frank Kane: We have TensorFlow being distributed on clusters,
that's coming together. So it seems like there's still like
10 different ways to do everything, but at least we're
starting to all come together at the same thing, that
it's not just about data science, it's not just about
machine learning, it's not just about big data, it's
about doing machine learning in a big data
environment.
Kirill Eremenko: Right. Why do you think now, why is the time now that
they're converging?
Frank Kane: That's a great question. I mean, I think it's just sort of
a natural process that's happening. There's definitely a
lot of interest and market forces that are behind this.
But really I think it's just that these technologies have
all been maturing at a similar rate and now they're all
at a point where they're like, "Okay, how do we all get
together and do something even better together?"
Right?
Kirill Eremenko: Yeah. Okay. Fair enough. Fair enough. What has been
your favorite course to teach? What has been your
favorite topic to share with the world?
Frank Kane: Oh, I always have a soft spot for recommender systems
because that was kind of what I specialized in during my
time at Amazon. So if I had to choose one child that I
love the most, it would probably be my recommender
system course.
Kirill Eremenko: Okay. Got you. So you did recommender system at
Amazon, are you able to tell us a bit about that? To go
into a bit of detail without sharing any IP or sensitive
information?
Frank Kane: Yeah, I mean it was seven years ago when I left
Amazon. So everything that I can tell you is well
beyond the range of their nondisclosure agreements
because it's history at this point. Right? But there's
still some good stories about it that I-
Kirill Eremenko: Good, let's talk about it. Sounds like it's still a very
relevant and really cool topic and a lot of companies
really enhance their sales. Netflix, Amazon, online
marketplaces, they... Even Udemy itself, right? You
take a course and then you get recommended to other
courses on what to take. So please do tell us about
that. What was your role at Amazon? I mean, what
kind of recommender systems were you exploring back
then?
Frank Kane: Yeah, I mean, let's see. I mean, originally I was
working on things like people who bought also
bought... I actually ran the team for that for awhile. So
if you're shopping on amazon.com and you're looking
at specific items, there'll be a little widget that says,
"People who bought this also bought this." Or "People
who viewed this also bought this." Or something along
those lines. So that was kind of like the heart of the
whole thing and this is all published publicly, so I can
definitely talk about it. So kind of like the main
component of doing any recommender system back in
those days, was this item to item similarities matrix.
Right? So we would take these vectors of everybody
that bought a given item, right? And make this 2D
matrix, and try to find similarity distances between
every item based on what customers they had in
common. And by doing that you can create a database.
It's basically like, "Okay, here's item ID, whatever
corresponds to this book, and it is similar to this list of
other books sorted in order by similarity." Right?
Kirill Eremenko: Okay. Could you tell us a bit more about that. So how
is the vector created? What are the dimensions of this
vector?
Frank Kane: Well, it's a very, very sparsely populated matrix, right?
So the main problem of recommender systems is that
most people did not buy most items. So a given person
only bought a very, very small percentage of everything
that Amazon sells. Right?
Kirill Eremenko: Yeah.
Frank Kane: So, basically these are all sparse vectors that you
think of as a matrix, but when you actually get down
to the code of actually constructing that matrix it's not
really a 2D matrix. Basically you have customers on
one dimension and items on the other dimension,
right? And you just try to find how it's all interrelated.
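The item-to-item similarity table described here can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code that ran at Amazon (which, per the conversation, was optimized C): the purchase data, the choice of cosine similarity, and the name item_similarities are all hypothetical. The sparse "matrix" is held as a dictionary of customer sets per item, so pairs with no customers in common are never stored.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase history: customer -> set of item IDs bought.
purchases = {
    "alice": {"math_book_1", "math_book_2"},
    "bob": {"math_book_1", "math_book_2", "cookbook"},
    "carol": {"cookbook", "novel"},
}

def item_similarities(purchases):
    """Return {item: [(other_item, score), ...]} sorted by descending score."""
    # Invert the data into sparse item vectors: item -> customers who bought it.
    buyers = defaultdict(set)
    for customer, items in purchases.items():
        for item in items:
            buyers[item].add(customer)

    similar = defaultdict(list)
    for a, b in combinations(sorted(buyers), 2):
        shared = len(buyers[a] & buyers[b])
        if shared == 0:
            continue  # no customers in common, so nothing is stored
        # Cosine similarity between two binary purchase vectors (an assumption).
        score = shared / sqrt(len(buyers[a]) * len(buyers[b]))
        similar[a].append((b, score))
        similar[b].append((a, score))

    for item in similar:
        similar[item].sort(key=lambda pair: pair[1], reverse=True)
    return dict(similar)

table = item_similarities(purchases)
print(table["math_book_1"])
```

In this toy data the two math books share both of their buyers, so each appears at the top of the other's list, while the novel never appears next to a math book because no customer bought both.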
Kirill Eremenko: Got you. Okay.
Frank Kane: Yeah, I mean that's kind of like the building block for
doing other cool stuff because once you know what
items are similar to other items, first of all, that's a
very permanent relationship, relatively speaking. So a
math book will always be similar to another math
book, is how we put it. These relationships aren't going
to change overnight. So you can get away with
computing that relatively infrequently. Right? And
once you have that, you can actually do things like
build up personalized recommendations by saying,
"Okay, here's the vector of everything that I personally
have liked either by buying it or looking at it or reading
it or something." Some indication of interest, I can go
out and get all the similar items that are similar to
everything that I expressed interest in, de-duplicate
those, score them and that becomes your personalized
recommendations. So that's what we call item-based
collaborative filtering, basically.
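The steps just listed (gather similar items for everything the person liked, de-duplicate, score) might look roughly like the sketch below. The similarity table is hard-coded hypothetical data, and scoring a candidate by summing its similarities is an assumption; the conversation does not say how candidates were actually scored.

```python
from collections import defaultdict

# Hypothetical precomputed table: item -> [(similar_item, similarity score)].
similar_items = {
    "math_book_1": [("math_book_2", 0.9), ("physics_book", 0.6)],
    "telescope": [("star_atlas", 0.8), ("physics_book", 0.5)],
}

def recommend(liked_items, similar_items, top_n=3):
    """Rank candidates by summed similarity to everything the user liked."""
    scores = defaultdict(float)
    for liked in liked_items:
        for candidate, score in similar_items.get(liked, []):
            if candidate in liked_items:
                continue  # skip things the user already expressed interest in
            # De-duplicate: a candidate reached twice keeps one entry
            # and accumulates its score.
            scores[candidate] += score
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ranked[:top_n]]

recs = recommend({"math_book_1", "telescope"}, similar_items)
print(recs)
```

Here physics_book is similar to both liked items, so it collects two scores and ranks first even though neither individual similarity is the highest.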
Kirill Eremenko: Okay. Got you. So that was back then. How have
recommender systems progressed now, in the courses
for instance, you teach these days, how are they
different?
Frank Kane: Yeah, I mean obviously the thing that's changed
everything has been the advent of deep learning, right?
So, now the modern way of doing it is to actually build
a big deep neural network. And again the challenge
there is getting a neural network to work with sparse
data. But Amazon for one has cracked that nut. They
have a system they've published called DSSTNE. You
can find it on Github, that does that and it works
really, really well. I was actually very impressed with
the results. But it's still hard to beat the old school
way of doing it. Item-based collaborative filtering still
produces great results. So while it is true that a deep
neural network can be a great tool for solving just
about any machine learning problem you can dream
up, these simpler approaches still give it a run for its
money.
Kirill Eremenko: Yeah. And also they're more cost effective I guess in
terms of computing power and time, to create and
things like that.
Frank Kane: Oh, absolutely. I mean, you'd be amazed how little
computing power we needed to actually produce those
item to item similarities, because it was all very highly
optimized code written in C. It was really, really tight.
But we used to really... A very Amazonian way of
thinking is to really favor simple solutions over more
complex solutions given the choice. Right? So given a
solution that will run on one system versus one that's
going to run on a hundred, if the end result to the
customer is going to be the same, we're going to take
the simpler solution because it's going to be easier to
maintain.
Kirill Eremenko: Yeah. Makes sense. So the question is what happens
when it's not the same result, when it's only 80% of the original
results. Do you use the
simplest solution and get 80% of the results or do you
go for the more complex one and aim for 100% of the
result? That's kind of the trade-off, but probably-
Frank Kane: Yeah. I mean we definitely spent a lot of time trying to
squeeze every percentage of improvement that we could
get out of it. Because it was such a huge lever. Right? I
mean, you can imagine, I think it's been published
that like 20% of Amazon sales was attributed to
personalization at that time. And that's not really the
real number, which I can't tell you the real number,
but that's the one that people talked about and it's not
that far off.
Kirill Eremenko: Yeah. It's crazy.
Frank Kane: But, yeah, when you have a lever that big, you think
about how many billions of dollars Amazon makes
every month, a one percent improvement is a
really big deal. Right? So if it really came down to it, and a
more complicated solution would give us a 1% boost in
sales, then yeah, we would do that. But generally
speaking, you didn't have to, you know what I mean?
The algorithms themselves can still be relatively simple
and you can still have a simple framework for blending
different algorithms together. So there are ways of
experimenting and trying simple changes and simple
solutions that will achieve those results.
Kirill Eremenko: Got you. And what would you say to somebody who
first of all, do you think any kind of business can
benefit from a recommender system or is it only just
B2C?
Frank Kane: Ooh, well, I wouldn't say any business can, but it's
obviously a useful thing. I mean, it depends mainly on
the size of your catalog, right? So if you're like the New
York Times and you have like a jillion articles and
somehow they're all still timely, which isn't actually
the case. Great, a recommender system might help
people find content that's relevant to their interests.
Maybe a magazine would be a more of a relevant
example there, but if you're just running a little like
mom and pop ecommerce store where you're selling
five greeting cards that you've made by hand, a
recommender system isn't going to be helpful. Right?
You'd be better served just manually creating
those pairings based on your human intuition than by
trying to build some algorithms. It's not going to
have enough data to work with in the first place.
Kirill Eremenko: Okay. Yeah, I know. It makes sense, makes total
sense. Tell us a bit about the difference between when
you have a recommender system that looks at content,
like for instance, you as an individual, you consume
certain content or you purchase certain items and
then it looks at similarities between items to
recommend to you versus recommender systems that
look at your similarities as an individual to other
individuals. And then it looks like what purchases they
made, what content they consume and makes
recommendations that way.
Frank Kane: Yeah, I mean that's basically what we call
user-based collaborative filtering as
opposed to item-based collaborative filtering. So the
idea of user-based collaborative filtering is that instead of
finding similar items, you find similar users by flipping
the problem on its head basically. And then you
recommend stuff that the similar users like that you
didn't indicate an interest in yet. That works too. The
problem is that people are more fickle than things,
right? So before I said that a math book will always be
similar to a math book. But Kirill might not always be
similar to Frank. I might go off and get interested in
astronomy tomorrow and say, forget about all this
data science stuff.
Kirill Eremenko: Which you are, which you are interested in astronomy,
which is really cool.
Frank Kane: That is my latest side hobby for sure. But still sticking
with the big data stuff for now. That's my day job.
Kirill Eremenko: Okay. And so people are more fickle and so therefore
it's harder to create those recommender systems, is
that what you're saying?
Frank Kane: I wouldn't say it's harder. It's actually exactly the same
technique. Just flipping the dimensions, one for the
other, but the results aren't going to be as good, I
would posit.
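Flipping the dimensions, as described above, can be sketched like this: compare users by the items they share, then suggest what the nearest users liked. The data, the cosine measure, and the function names are hypothetical choices for illustration only.

```python
from math import sqrt

# Hypothetical data: user -> set of items they showed interest in.
likes = {
    "frank": {"spark_book", "hadoop_book", "telescope"},
    "kirill": {"spark_book", "hadoop_book", "tableau_book"},
    "dana": {"cookbook"},
}

def user_similarity(a, b):
    """Cosine similarity between two users' binary interest vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))

def recommend_from_neighbors(user, likes, k=1):
    """Recommend items liked by the k most similar users but not by `user`."""
    mine = likes[user]
    neighbors = sorted(
        (other for other in likes if other != user),
        key=lambda other: user_similarity(mine, likes[other]),
        reverse=True,
    )[:k]
    recs = set()
    for other in neighbors:
        recs |= likes[other] - mine  # only things the user has not seen
    return recs

print(recommend_from_neighbors("kirill", likes))
```

The fickleness problem shows up directly in this structure: if one user's interests shift, every similarity involving that user goes stale, whereas the item-to-item table changes much more slowly.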
Kirill Eremenko: Got you. Okay. Are there any other types of
recommender systems, in addition to user-based and
item-based collaborative filtering, more innovative or
newer experimental types of recommender systems
that you can share with us?
Frank Kane: Yeah, definitely. Before I forget though, on the previous
point, another downside of user-based collaborative
filtering is that there's usually more users than things
in a given website. So you have much greater
computing requirements to actually compute user
similarities than item similarities.
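A rough way to see the cost difference is to count pairs. The figures below are made up purely to show the scaling, and the count assumes a naive all-pairs comparison; a real system would skip pairs that share nothing.

```python
def pair_count(n):
    """Number of distinct pairs among n things: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# Hypothetical sizes, chosen only to illustrate the scaling.
n_items = 10_000
n_users = 1_000_000

print(pair_count(n_items))  # candidate item-to-item comparisons
print(pair_count(n_users))  # candidate user-to-user comparisons
```

Because the pair count grows with the square of the population, having 100 times more users than items means roughly 10,000 times more candidate comparisons in this naive setting.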
Kirill Eremenko: Interesting. I wouldn't say that about Amazon though.
They have so many things that they sell. I guess it's a
bit debatable question-
Frank Kane: They do. Yeah, I mean, I actually don't know what
their current numbers are, but you're right. It's
probably not that far off at this point. They sell
everything you can imagine they can sell.
Kirill Eremenko: That's crazy-
Frank Kane: I think there's still more people interested in things
that they can buy-
Kirill Eremenko: And there's new things that are popping up. For
instance, I'm here, I'm in Bali right now and people
use this thing called AliExpress from China. I'm not
sure if it's related to Alibaba or not, but then there's
also Alibaba, there's eBay and Amazon seems to be... I
was thinking about this the other day, Amazon seems
to be very dominant in the US, Australia, now they are
in Australia as well. Some European countries, but
more in the Asia space, in the Asian market,
something that people don't recognize or realize that
there's these other players that are gaining so much
momentum that are growing so fast that there's some
countries here, where the people haven't even heard of
Amazon and yet they're shopping online, buying
everything. Even in China, what's it called? That
platform WeChat. I think you can get anything on
WeChat. You can get a car wash through WeChat, it's
ridiculous. It's crazy how big these things have gotten
and yet we just simply don't hear about them for now
until they come and start disrupting the normal world
that we are used to living in.
Frank Kane: Absolutely. I mean, right when I left Amazon was when
they were trying to get into the Asian market a little bit
more. And I mean, it's been a real challenge for pretty
much every US tech company that I can think of.
Right? I mean, it's just a completely different political
climate, completely different culture. And unless you
partner with a big company that's out there existing
already, which is hard to do by the way, it's hard to
break in there for sure.
Kirill Eremenko: Yeah, yeah, for sure. All right. What about the different
recommender systems like new, innovative?
Frank Kane: Yeah. I mean, kind of the thing that evolved after
collaborative filtering was what we call model-based
methods. So basically matrix factorization. So the idea
is if you can think of the recommendation problem as
multiplying two matrices together, that's basically like
your matrix of interests as an individual by some
matrix that ties those interests to other things. That's
just another way of approaching the problem basically.
So we have things like SVM that are used for that,
SVD rather. SVD++ is a specific variation on
SVD that's used for recommender systems that has
really good results.
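A minimal sketch of the matrix factorization idea, assuming a plain latent-factor model trained with stochastic gradient descent. This is not SVD++ and not any specific production system. The toy ratings, the learning rate, the regularization value, and every function name here are hypothetical. Each user and each item gets a short vector of latent factors, and a predicted rating is the dot product of the two.

```python
import random
from math import sqrt

# Hypothetical toy data: (user_index, item_index, rating). Not from the episode.
RATINGS = [
    (0, 0, 5.0), (0, 1, 3.0),
    (1, 0, 4.0), (1, 2, 1.0),
    (2, 1, 1.0), (2, 2, 5.0),
    (3, 0, 1.0), (3, 2, 4.0),
]

def predict(P, Q, u, i):
    """Predicted rating = dot product of the user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    """Root mean squared error of the predictions on the known ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return sqrt(total / len(ratings))

def init_factors(n_rows, k, rng):
    """Small random starting values for an n_rows x k factor matrix."""
    return [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_rows)]

def train(P, Q, ratings, lr=0.01, reg=0.02, epochs=2000):
    """Fit both factor matrices in place with stochastic gradient descent."""
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(len(P[u])):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)

rng = random.Random(0)
P = init_factors(4, 2, rng)  # 4 users x 2 latent factors
Q = init_factors(3, 2, rng)  # 3 items x 2 latent factors
rmse_before = rmse(P, Q, RATINGS)
train(P, Q, RATINGS)
rmse_after = rmse(P, Q, RATINGS)
guess = predict(P, Q, 0, 2)  # a user/item pair with no rating in the data
```

The product of `P` and the transpose of `Q` fills in the whole user-by-item matrix, so `guess` is an estimate for a pair that never appeared in the training data.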
Kirill Eremenko: What does SVD stand for?
Frank Kane: Singular value decomposition. So basically it's a
matrix factorization technique. But yeah, I mean that
was basically one of the winning approaches in what
they call the Netflix Prize a while ago, I don't know if
you've ever heard of that one.
Kirill Eremenko: Yeah.
Frank Kane: So Netflix put out this, I think it was a $1,000,000
bounty was it? If I remember right. For anyone that
could like make a recommender system that was, I'd
have to look at the number, but I think it was 10%
better than what they had, measured by RMSE score.
And as I recall the winning entry actually used SVD as
part of their solution. It was actually more of a hybrid
approach. But that was part of how they did it.
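For reference, RMSE (root mean squared error) is the square root of the average squared gap between predicted and actual ratings, so lower is better. A minimal sketch follows, with made-up ratings and a hypothetical `relative_improvement` helper to express a gain of the kind Frank recalls as a fraction of the baseline.

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of the same length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return sqrt(squared / len(actual))

def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse is lower than baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse

# Made-up numbers, purely for illustration.
actual = [4, 3, 5, 1]
baseline = rmse([3, 3, 3, 3], actual)   # always guess the midpoint
candidate = rmse([4, 3, 4, 2], actual)  # a better set of guesses
gain = relative_improvement(baseline, candidate)
```

A `gain` of 0.10 or more would correspond to the "10% better" bar mentioned above.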
Kirill Eremenko: Yeah.
Frank Kane: So that was kind of like the next generation of
recommender algorithms at that point. And after that
we entered the age of deep learning, right? So now it's
all about, "How do I use a neural network to solve this
problem?" And that's where we get into things like
Amazon DSSTNE. And that's also how companies like
YouTube are doing it as well. They published a really
interesting paper that details exactly how they're doing
their recommendations, using a deep neural network.
Kirill Eremenko: Why do you think they're not afraid to disclose their
intellectual property like that?
Frank Kane: Well, I mean, they're part of Google and Google's
always kind of had this open academia-friendly stance,
right? So I think it's mostly just a company culture
thing. Plus they realize that nobody has their data. So
one thing that I learned at Amazon is you can have...
The quality of your data matters way more than the
quality of your algorithm. In Amazon, if you know
everything that everybody's actually bought that
they've actually spent their money on, you're not going
to get better data on their actual interest than that.
Right? So having that powerful interest data to start
with, means that you can do pretty much anything on
the algorithm side and still get awesome results. And I
think the same is probably true of YouTube as well.
They actually know if you're actually watching a video
and for how long did you actually stick with it all the
way through and they can use that view data to
actually figure out what you're actually interested in.
Right?
Kirill Eremenko: Yeah. This ties into an interesting question, that
value... And this is for real business owners out there
and for heads of departments and executives. The
value is not in your algorithms, the value is in your
data.
Frank Kane: Right.
Kirill Eremenko: I find, still to this day, companies sometimes sit there
and think that they're going to create some miraculous
world changing algorithm. They're super protective of
it. They either patent it, or in most cases they keep it
as, from what I understand that they keep it as a trade
secret so that nobody even [inaudible 00:34:50] get
access to it. But realistically, we live in a world where
Google publishes more than one research paper per
day about machine learning, AI, computer vision, deep
learning. Per day. That's crazy. So there's no way,
and that's all open source. So Python-based,
predominantly TensorFlow or PyTorch for Facebook.
Those things are open source. You can go and
download them and there's no way you're going to beat
Google.
Kirill Eremenko: There's no way you're going to invent something that's
so bespoke that Google's never going to be able to
create that on their side. And it's just going to take so
much resources and effort from the perspective of a
small, medium, even large business. It's just much
easier to go out there, read these research papers,
track what you need, apply it. It doesn't matter that
it's open source because at the end of the day, the
value's not in the algorithm, the value is in the data that
you have.
Frank Kane: Absolutely. I think another motivation for them to
share this research is from a recruiting standpoint too,
right? They want to get smart engineers out there,
learning about how to use their systems and get
excited about them and hopefully they can recruit
them to work at Google. I mean, that's ultimately their
goal. I mean, that's really the number one concern of
these tech companies. They just cannot hire enough
experts in these fields to meet their demand.
Kirill Eremenko: Yeah, yeah, totally. And for recommender systems,
we've seen this evolution that you kindly walked us
through on how they've changed. What I'm noticing is
that they're getting really good. They're getting crazy.
As a user, I go on Netflix and I... Something pops up
and I'm like, "Whoa, that's really cool. I didn't even
know that existed. So glad that I found out about this."
Or I gave this example, I think a couple of podcasts ago,
where my mom has a special relationship with
YouTube, in that she doesn't even search for videos
herself. She just relies on YouTube to recommend
things. And then she already knows she's going to love
it. And she just goes with the flow and just watches
whatever recommendation comes up. And so whenever
somebody else touches her iPad, she gets a bit
protective of it because she doesn't want-
Frank Kane: Yes, I have a feeling.
Kirill Eremenko: My dad's interests in her YouTube because that's going
to mess with her recommender system. So examples
like that illustrate that they've gotten really good, very
powerful and they sometimes know us better than we know
ourselves. What kind of future do you see for
recommender systems? Where's this whole space
going? If it's already that good, what can we expect to
appear next?
Frank Kane: Well I think you're right in that the algorithms aren't
going to get that much better. Already I would
say that the difference in quality between deep
learning systems and some of the older systems or
matrix factorization is pretty minimal, quite honestly.
Really comes down to the quality of the data, like you
were saying. So the big leap forward is going to be as
people amass more and more of this data to learn
more and more about you. But now we're like starting
to get into this world of ethics, right? And privacy. So
it's going to be interesting times for sure. Because at
the same time, we don't want these... You don't want
YouTube to know everything about you necessarily,
but you still want good recommendations from
YouTube. Right? You can't have both.
Kirill Eremenko: Yeah.
Frank Kane: So, I'm not really sure how that's going to play out
right now. It's an interesting time for that.
Kirill Eremenko: What do you think of this notion? I was discussing
this with somebody, I think a few podcasts ago as well,
but I'd love your opinion on this, that 100 years from
now, privacy will be such a foreign concept. People will
be looking back on it and be just thinking, "Why was
this even a thing? What did privacy even mean?
What's the definition of privacy?" Because we're so
rapidly moving to a world where people, especially
millennials, are trading in their privacy and anything,
any information they have on themselves, trading it in
for better services, better products, better user
experiences. And that's not even a question to them.
Kirill Eremenko: So this whole privacy issue, from my conversations, I
see it as more of a... My generation, older
generations, that's a concern for us. But the new
generations coming around, they don't really worry
about that stuff so much. So right now, yes, there are
some legal struggles and barriers that are being
put in place, but there is a theory that in 100 years
from now there will be no such thing and everything
will be completely publicly available, fully exposed.
What do you think?
Frank Kane: Yeah, I mean I think like you said, the younger
generations are already there. They don't really have a
concept of what privacy even means, right? At least
online. They definitely want physical privacy still, but
online, it's not even a thing. It's not a concept. What
does that even mean to them? I don't know. So I think
we're already there to some extent, honestly. The
question is, what do we do with all that information
that people have given up? And if government started
abusing that information, to persecute people or
something, then people are going to care about privacy
real fast. Hopefully that won't happen. But the other
thing too is, we're using all this personal information
to... This is a very real problem right now, filter
bubbles, trying to create these echo chambers online.
Where we're using a lot of the same technologies that
we developed way back in the day to try to recommend
better books to you to figure out what are your
interests personally and how do we connect you with
more news and information and people and viewpoints
that are consistent with what you already like.
Frank Kane: This is how you end up in these online bubbles, right?
And that's very much a pressing issue right now. And
you have people quitting Facebook because they don't
want any more part of it. So that's what I'm kind of
talking about when I say, it'll be interesting to see
where this all goes. I mean, I myself quit Facebook in
January because of this stuff and I know a lot of my
friends have as well. So as for millennial they-
Kirill Eremenko: Tell us a bit more about that, I didn't know you worked at
Facebook.
Frank Kane: Oh, no. I mean I meant I quit Facebook as a user.
Kirill Eremenko: Oh as a user. Okay, yeah.
Frank Kane: Yeah. I deleted my account.
Kirill Eremenko: Yeah. Got you. No, yeah, definitely some of these
things that are very controversial. Yeah, it'll be
interesting to see where it goes. But one question that
you might be able to help guide our audience in the
right direction is, if somebody wants to get into the
space of recommender systems, right? There's lots of
spaces in data science, machine learning, deep
learning, and sorry, big data, that are very
exciting. But I guess recommender systems is one of
those that is kind of on the verge of these converging
or on the overlap of these converging areas that we
talked about of big data. In recommender systems,
there's often this big data, there's a lot of data.
Kirill Eremenko: At the same time machine learning and data science, it
could be an interesting place for people to dive into if
they want to be in between these fields. So what would
your advice be for somebody who wants to get into
recommender systems, but doesn't have much
experience in the space? Zero to not much. Where
should they start? What should they look into? And in
general, how would you recommend going about
getting into this space of recommender systems?
Frank Kane: Well, I would say first and foremost to be a good
software developer. When I was at Amazon, we hired
software development engineers primarily. We didn't
really care what their specialization was, we just cared
that they were smart enough to write code and do it
well. And we figured if you can do that, you can learn
anything because this stuff changes every freaking
day. Right? So we didn't really focus on hiring people
for specific skills. Like in my case, they hired a guy
that did visual simulation in video games and just
taught him how to do this stuff when he came in. So
step one is to be a good software engineer and maybe
that means Python, if you want to start off easy, that's
certainly, it's still a great choice, but just get proficient
in some sort of programming, if you aren't already.
Frank Kane: Beyond that, you're going to need some background in
linear algebra. To understand the algorithms, you
need to have at least that level of mathematical
background to understand what's going on. Right?
And from there you can start to actually learn the
actual algorithms and techniques, either from my
course or a book or however you want to do it, or
online resources. Everyone learns different ways.
That's cool. And then you can actually start playing
around with small datasets, on your own PC. One that
I like to use is called the MovieLens dataset. I don't
know if you know that one. Basically they have...
Really? Yeah, go to grouplens.org, I think it is. And
they have this free dataset of movie ratings that I
love to play with, probably because I used to work at
IMDb. So I have a soft spot for movies but they have
different sizes you can mess with.
Frank Kane: So they have like a 100,000-rating dataset and then
they have a 20 million dataset. And so you can work
your way up to bigger and bigger data, but you can
start just playing around, on small datasets, get a
sense of how these algorithms work, experiment with
them, try different ways of doing it. That's really what
it's all about. Just experimenting with different ideas
and different tweaks and different parameter tuning
and well, hyperparameter tuning I guess, that's the
technical term for it all these days. And then you can
think about scaling it up, right?
Kirill Eremenko: Yeah.
Frank Kane: So then you can start to think about how do I blend
this with tools like Apache Spark. If I'm going to be
using neural networks, can I use TensorFlow to
distribute this across a cluster? That would be kind of
the final stage. And once you're at that stage, I would
say, start messing around and do some freelance
work. Prove that you can actually do this and build
something. And at that point you will probably be able
to find a job in this field.
Kirill Eremenko: So the jobs are there, people want to hire people for
recommender systems?
Frank Kane: Yeah, I mean that's just central to a lot of the big
technical companies out there, right? I mean, Amazon,
we talked about, huge part of their revenue. YouTube,
huge part of their views. Netflix, it's what they're all
about. Their entire company is about
recommendations, fundamentally. They're just built
around the whole thing. And a lot of people don't
realize that. Yes, I mean, deep neural networks are
hot, but really it's recommender systems that these
companies are built around and they cannot find
enough people who know this stuff.
Kirill Eremenko: Yeah, no, that's really great advice. Thank you so
much. At this stage I wanted to shift gears a little bit
and talk about what you mentioned just before we
started the podcast that at Amazon you were part of
the hiring and recruiting process. We'd love to learn a
bit more about that and maybe there's some tips and
tricks you can share for people to get hired at Amazon
or maybe even beyond that.
Frank Kane: Yeah, definitely. Yeah, so part of my duties at Amazon
was that I was what they called a bar raiser. And this is
basically a role where you spend a lot of your time
doing interviews, both phone interviews and in-person
interviews, mostly in-person interviews. So whenever
there's an interview loop at Amazon there's one
person on that loop called a bar raiser who interviews
you and it's not necessarily someone that's in the team
you're interviewing with or even the same department.
Frank Kane: Their role is to sort of make sure that Amazon's
standards for hiring are being applied consistently
across the entire company. So I was that guy. So it
meant that I had veto authority over every hire that
came across my desk basically. And I led all the hiring
discussions where we decided whether or not to hire
someone. Right? So a lot of influence there. And as a
result, I ended up interviewing over a thousand people,
I think while I was there or some crazy number.
Kirill Eremenko: Wow.
Frank Kane: Yeah. So as far as tips go for getting into Amazon, my
number one tip is to always think in terms of the
customer. It's not just lip service when Amazon says
that they're customer focused, it really does permeate
their entire culture. And anytime that you can tie a
question or a problem that you solved to the
viewpoint of the customer, you're going to get major
brownie points. All right. So anytime you're asked to
design a system, work backwards from the customer
experience, start with what will the customer get out of
this system? What do they want to see? What are
their requirements? How fast does it need to be for
them? Right? What results do they want to see? And
then figure out what technology you'd have to build to
deliver that experience. Don't start from the bottom,
don't say, "I know this cool algorithm and I would use
this cool algorithm and build it out and hopefully
customers would like it." That's the wrong answer. So,
always start with the customer experience, is tip
number one.
Kirill Eremenko: Great tip, great tip. What else?
Frank Kane: Well, you can go online and look for Amazon's
leadership principles. And customer obsession is
number one, but there's others as well. And I would
just encourage you to familiarize yourself with all of
those leadership principles. The other ones are
ownership, invent and simplify, are right a lot, learn and be
curious, insist on high standards, think big and really
internalize what these all mean and come up with
stories that you can talk about where you've exhibited
these qualities on your own. Because again, you're
going to get a lot of interviews with managers and
bar raisers like myself who aren't necessarily part of
the team that you're interviewing with, that you're going to be
put on. And these are the things they're really looking
for. Do you fit with Amazon's culture and way of
thinking?
Frank Kane: Obviously you need to be technically competent as
well. It's going to be a very long and grueling day there
writing code on the whiteboard and solving design
problems on the whiteboard. So by all means, you
have to be ready to do that. You have to have really
strong coding skills, really strong systems design
skills. That's going to be the case for any interview.
But what's different about Amazon is they actually
care about what they say about their values and
principles that they live by. And you need to
demonstrate that yourself.
Kirill Eremenko: Very interesting. What would you say has been the
biggest mistake that you've seen recurring on the
entries that people make?
Frank Kane: Oh man, you'd be amazed. It's just like not knowing
how to code.
Kirill Eremenko: No way.
Frank Kane: Yeah. You'd be amazed how many, especially in phone
interviews, usually they get weeded out by the time
they actually come in-house.
Kirill Eremenko: Yeah.
Frank Kane: But we used to have a... Have you heard of fizzbuzz?
Kirill Eremenko: Nope.
Frank Kane: Okay. This is one of the interview questions that we
use for screening out people and it's widely known, so
I'm not giving away anything secret here. The problem
is this: iterate through the numbers one through 100
and write code that, if it's an even number, prints fizz,
and if it's an odd number, prints buzz, or something
like that. I forgot the exact structure of the problem,
but it's just that simple, right?
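Frank notes that he doesn't remember the exact rules. The widely known version of FizzBuzz prints "Fizz" for multiples of 3, "Buzz" for multiples of 5, "FizzBuzz" for multiples of both, and the number itself otherwise. A minimal sketch of that version, with hypothetical function names:

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

def fizzbuzz_lines(limit=100):
    """FizzBuzz output for 1 through limit, inclusive."""
    return [fizzbuzz(n) for n in range(1, limit + 1)]

if __name__ == "__main__":
    print("\n".join(fizzbuzz_lines()))
```

The only ordering detail that matters is checking the combined case before the single ones.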
Kirill Eremenko: There's no catch trick?
Frank Kane: No, that's it.
Kirill Eremenko: Okay.
Frank Kane: I'd say about 5% of the people couldn't do it.
Kirill Eremenko: No Way. That's like a five minute exercise.
Frank Kane: Yeah, yeah. You'd be amazed. So make sure you can
write code guys. That's my main tip. But beyond that,
just make sure you're well rested. A lot of people come
in kind of like low energy because they flew from
someplace far away the night before and didn't have
enough coffee or whatever. But you just got to have a
lot of stamina to get through the day, if you do come in
house. So make sure you're rested, drink whatever
beverages you want to drink to stay alert and whatever
hack you have to do to make sure that you keep your
energy level up throughout a very challenging
day.
Kirill Eremenko: Very interesting. So know how to code and keep your
energy up. I wasn't expecting those two tips as the
most common mistakes. All right. What would you say
is like the biggest, I don't know biggest advantage of
somebody who comes in for an interview if they have
this skill or have this experience, or can demonstrate
something that... Then almost right away, everybody
knows, "Okay, this is the person." Have you ever had
that feeling, you see a person, you haven't interviewed
them much, but almost right away you can tell this
person is going to make a great addition to the team.
We definitely want them on board.
Frank Kane: Ooh. I'm always careful in those situations because
sometimes your gut is wrong. Right? I mean, human
brains are fickle things as I'm sure you know, now that
we know how to simulate them to some extent. So I've
been a manager long enough to know that it is very
easy to make bad hiring decisions on someone that
looked great on paper or came across as very
charismatic. Right? You really need to separate that
charisma from how they're going to be able to interact
with your team. Are they going to be a "team player"
that doesn't have a huge ego to deal with, things like
that. So I've never been in a situation where like, "Oh
my God, I talked to this person once and we absolutely
have to hire them right now." But after two or three
interviews, yeah, there've definitely been cases where
I'm like, "We really got to get this person here. Pull out
all the stops, make them an awesome offer. Whatever
they want, give them twice that."
Frank Kane: But when it comes to stock grants and things like that,
they often had quite a bit of discretion as to what they
can offer people to get people that they really wanted.
Kirill Eremenko: Got you. And what has been the most common trigger
for you to use your veto power and not hire somebody
that maybe even others thought was kind of those?
Frank Kane: Yeah, I mean after doing that many interviews, you
kind of like learn what a good engineer looks like. And
I guess the thing that would probably give me the most
pause would be someone who pretended that they had
more experience and knowledge than they really did.
They're kind of a little bit deceptive on their resume. I
can uncover that pretty quickly. That's not cool. So
don't do that. Or someone whose coding ability at that
point just wasn't up to snuff. Right?
Kirill Eremenko: Yeah.
Frank Kane: The main problem that... The reason that the bar
raiser existed is because there's huge pressure to hire
at Amazon or any big technology company because
there's just not enough good engineers in the world to
go around. And a lot of these teams are really
desperate to fill positions. That is their number one
goal is to just fill seats within their team and get more
engineers working on whatever they have to deliver.
And my role is to make sure that they don't get so
desperate that they lower their standards. Right? So
that's what that's all about.
Kirill Eremenko: It's interesting, isn't it? That there's so many, as you
say, seats in the companies and they're just so eager
to hire people and on the other hand, we have such a
huge pool of candidates, so many data scientists,
engineers out there who want to get hired. It's just like
the bottleneck is that weeding out process and finding
the talented people, which there's plenty of as well, but
they're rare, right? Compared to millions or hundreds
of thousands of people who want to get hired. Those
hundreds or dozens of people that are really talented
and still also want to get hired. They really need to
stand out somehow for... If they had a beacon above
their head that, "Hey, I'm talented." You'd hire them in
a heartbeat. But that's not the case.
Kirill Eremenko: You have to go through this process. So is there
anything that talented people, whom I'm assuming
many of our... most of the
people listening to this podcast are, they care about
their careers already by definition because they're
listening to career advice on these topics. Is there
anything that they can do to help recruiters such as
yourself, or such as who you were back in your past life
at Amazon, to identify them, to make that whole process
easier and that match happen faster?
Frank Kane: Yeah, I mean, it's like you said, you've got to build that
beacon above your head, right? So here's the reality of
the situation. Everyone applies to Amazon and
Google and all these big companies and they don't
even look at the resumes that are submitted to them
because there's just so many of them. And weeding
through them all is impossible. Instead, they will come
to you, right? So you want to make sure that you've
done something that's going to catch the attention of a
hiring manager or recruiter at the company that you
want to get hired at. One way to do that is to know
somebody, right? So if you know somebody who
already works at the company you want to work with,
oftentimes they get referral bonuses, if someone that
they recommend gets hired. And that's probably the
best way to get your foot in the door.
Frank Kane: So really scour your social network, scour LinkedIn,
see if you know anybody or if you have a friend of a
friend at the company that you want to get into
because that might be your best way to get noticed.
But beyond that, if you don't, make sure you're
winning coding competitions. Make sure you have stuff
on GitHub that people can find, get published. Put out
a blog, make sure you're on LinkedIn and have the
right keywords that they're looking for there. Because
the recruiters are looking for you. They're not waiting
for you to come to them. Right? Beyond that, I mean,
obviously the more traditional channels like college
recruiting is an important source of new hires for
these companies as well.
Frank Kane: Career fairs and stuff at colleges or obviously, if you
graduated from Stanford, you're probably going to get
a call from all of these people, right? But not everybody
can afford to go to Stanford. So for everyone else, you
just have to make sure that your profile stands out
online and your accomplishments are easy for them to
find.
Kirill Eremenko: Fantastic. And I just want to add to that, that in the
process of you putting up all these things online,
whether on GitHub, on Medium, blog posts, videos,
whatnot, you're going to make connections, right?
People who are already at Amazon, they're not just sitting
there twiddling their thumbs and just doing Amazon
work or whatever other company they're in. They also
go out there and they also read, they also want to
know what's new, what's been happening in the competition
space, what's new on GitHub, what's new... a
recommender system that somebody is exploring. So
inevitably the more stuff you put out there, sooner or
later somebody from Amazon's going to read it and
they might ask you a question and then you talk to
them and then you can build that connection.
Kirill Eremenko: So you don't have to just go and set yourself a target,
"I have to know somebody." Even if you just do,
as Frank said, this part of
just building your online presence, eventually you'll
build these connections in a very natural way. And
sooner or later somebody from Amazon or Apple or
whoever else you want to get into is going to come
across your way. So yeah, these two come hand in
hand and it's a self-fulfilling prophecy as long as
you invest the time and effort and energy into it.
Frank Kane: Yeah, I agree completely. I mean everybody at these
companies is invested in hiring. It's not just the
hiring managers and recruiters and if they come
across something you've done online and they like it,
they very well may reach out to you. So you're
absolutely right.
Kirill Eremenko: Fantastic. Well thanks a lot Frank. We've slowly come
to the end of this podcast and super pumped about
the chat that we had. Before I do let you go, please tell
us a couple of places where our listeners, our audience
can follow you, get to know you better and see what
new things you'll get up to in the coming months and
years.
Frank Kane: Yeah, I mean, if you want to check out what I'm up to,
you can head to my website, which is
sundog-education.com. And from there you can follow me on
whatever social media you wish. And also you'll find...
we've got to give a tip of the hat to Manning
Publications at manning.com and you can find a
couple of my new courses from them under their live video
tab there. The Elasticsearch 6 and The Ultimate
Introduction to Big Data are found there.
Kirill Eremenko: Fantastic. And is it okay for our audience to connect
with you on LinkedIn as well?
Frank Kane: Absolutely. The more the merrier. So bring them on.
Kirill Eremenko: Fantastic. Awesome. Well, Frank, thanks so much.
One last question I have for you today is what's a book
that you can recommend to our listeners that's
changed your life?
Frank Kane: Ooh, that's changed my life. The most recent one that I
read is a big thick book called... let's see. I have it right
here, Recommender Systems Handbook. And it's
basically a huge collection of papers from various
researchers in the field including Netflix and people
like that. So as I was preparing my recommender
system course, that was a hugely valuable resource for
getting caught up on the current state of the art. And
for someone new to the field, I think it's sort of
required reading for figuring out what's out there and
getting a broad lay of the land of the techniques that
are being used today.
Kirill Eremenko: Awesome. Is this by Francesco Ricci? [crosstalk
00:58:38] up on Google.
Frank Kane: Yeah, it's published by Springer.
Kirill Eremenko: Springer, yeah. Published by Springer. Yeah, I found
it-
Frank Kane: Yes, I'll pull it out here at my bookshelf, yeah,
Francesco Ricci, that's right they're the editors.
Kirill Eremenko: Okay.
Frank Kane: It's not cheap but it's worth it.
Kirill Eremenko: Yeah, definitely. The best things in life are sometimes
free, sometimes you've got to buy them and then they'll
change your life. Okay. On that note, Frank, thanks so
much once again for coming on the show and sharing
all the insights and knowledge. Really cool chat and
yeah, catch you soon. Maybe at Udemy Live this year.
You're going?
Frank Kane: No, I'm not going this year, but definitely next year.
Kirill Eremenko: Okay, no worries. We'll catch up then. Thanks so
much for coming on the show.
Frank Kane: All right, good talking to you.
Kirill Eremenko: Thank you ladies and gentlemen, boys and girls for
being part of this conversation. My favorite part was
about the convergence of data science and big data.
It's very interesting how these two fields are becoming
more and more intertwined. And of course there were
plenty of other great and useful insights throughout
the podcast. A huge shout out goes to Manning
Publications, which is hosting some of Frank's
courses. So you can find Frank either on Manning
Publications or on Udemy, and if you haven't taken
any of his courses yet, I highly recommend checking
them out, especially if you're interested in getting into
the space of big data after today's podcast.
Kirill Eremenko: As usual, you can get all the show notes at
superdatascience.com/265, that's
www.superdatascience.com/265. There, you'll find all
of the resources and materials that were mentioned in
this episode, plus the transcript for the episode. And
of course, any links to Frank's social media where
you can get in touch with him, you can follow his
career or simply check out his courses. On that note,
thank you so much for being here today. I am very
grateful that you're part of the SuperDataScience
podcast and the SuperDataScience journey, and the
community that we're building. If you're
not aware of it yet, we actually just launched a
Slack channel for SuperDataScience members.
Kirill Eremenko: So if you're a member at SuperDataScience, you must
have gotten an email. Make sure to join that Slack
community that we're building. It's not just one Slack
channel, it's actually a multitude of Slack channels in
a Slack community, where you can chat to each other,
to me, and to instructors. And if you're not a
SuperDataScience member yet, then make sure to
check out superdatascience.com where we're adding
new features all the time. On that note, thank you so
much and I look forward to seeing you back here
next time. Until then, happy analyzing.