Kirill: This is episode number 93 with Data Scientist at Liquid
Biosciences, Beau Walker.
(background music plays)
Welcome to the SuperDataScience podcast. My name is Kirill
Eremenko, data science coach and lifestyle entrepreneur.
And each week we bring you inspiring people and ideas to
help you build your successful career in data science.
Thanks for being here today and now let’s make the complex
simple.
(background music plays)
Hello everybody, and welcome back to the SuperDataScience
podcast. Today we've got something very special prepared for
you. Today on the show we had Beau Walker. And we just
finished our episode, and now I'm super enthusiastic about
everything! The amount of things that he shared is crazy.
And very interesting things at the same time. So Beau's got a
crazy background, and we'll talk about that just in a bit. It
feels like he's had like five different careers. And what he
does now for a job is a specific type of data science, which is
evolutionary programming-based machine learning. And the
description he gave is intense. It's like when they create this
environment for algorithms to evolve on their own, or models
to evolve on their own, just like we had in evolution, when
animals evolved. So how they reproduce, how they fight with
each other, survival of the fittest, and things like that.
So it's a very, very interesting space. I had no idea. I had
some interaction with people in a similar space, but I had no
idea it was so evolved, and exactly what it's all about. So this
is going to be very exciting for you to check out. Also, we
talked about patents and trade secrets. So Beau, apart from
being a data scientist, he also has a degree in law, and
specifically in the area of patents and trade secrets, so that
can be very useful.
And of course, we talked about his journey, how he went
through all these different careers, what he experienced,
what he felt, and what choices he made down his career
path. So a very exciting episode ahead. Can't wait for you to
check it out. And without further ado, I bring to you Beau
Walker.
(background music plays)
Welcome everybody to the SuperDataScience podcast. I've
got a very exciting, and very interesting guest today, Beau
Walker. Beau, welcome to the show. How are you going?
Beau: Thank you Kirill. I'm doing great. Excited to be here.
Kirill: Where are you calling from?
Beau: I'm calling from Orange County, California.
Kirill: That's so cool. How's the weather in Orange County?
Beau: It's beautiful, 75 degrees, sunny.
Kirill: What's 75 degrees in Celsius?
Beau: Oh, let's see, now you're making me think on the spot. Let's
see, twenty-something? Yeah. I don't know!
Kirill: 29? 25?
Beau: I lived out of the US, and dealt with –
Kirill: There we go. 23.8, right?
Beau: Yeah, there we go.
Kirill: Ok, well nice, nice. You guys are slowly headed into winter,
so it should be pretty cool.
Beau: Yeah.
Kirill: So Beau, we got in touch through a common connection,
through our dear friend, right? Who's our friend?
Beau: Yes, Ben Taylor.
Kirill: Yeah. So how do you know Ben?
Beau: Ben and I are from the same place. I'm from the state of
Utah, in the US, originally. Ben lives there now. We've been
connected on LinkedIn for a number of years, exchanged
back and forth. And his current co-founder, we actually
graduated from the same programme at BYU, the university,
but not at the same time. So we just have a lot of common
connections.
Kirill: It's interesting how the world works, right? I also know Ben
Taylor from LinkedIn, also for a number of years, and none
of us, neither you nor I, have met him in person, and yet
with him, we've already talked about so many things, and
now he's connected us, and it's for everybody listening out
there, it's such a powerful world these days, that especially
through LinkedIn and online, you can create some amazing
connections and friendships. It's really cool.
Beau: Absolutely true.
Kirill: Awesome. So, Beau, you've got like a crazy story. You've
done consulting, you've done marketing, you've done law,
you've done data science, you've done biological things. I
don't even know where to get started. I'll probably pass it
over to you.
Beau: In the past, when recruiters have looked at my resume,
they've been confused.
Kirill: To say the least, yeah? At the least, confused.
Beau: Yeah. But there is a common story, and I have a lot of varied
interests, but there is a common thread. And I feel like, at
least personally, that my background lends itself very well to
data science. Data science is still a young profession, at
least by that name, and it's often a profession where the
rubber meets the road between data and actually making it
translate into something useful for business. And so, a
varied background, a different background, can be very
useful.
So my background is that my dad's a marketing guy. He's
been in marketing his whole career. He's also an inventor.
He has a number of patents. And just one of those guys,
entrepreneur, inventor, always talking about ways to make
things better, or new inventions, and that really stuck with
me. And when I went into university, I've always had a really
strong interest in technology and science, and I went into
biology. And pretty early in my undergrad, I got involved
with some great labs doing research with some great
professors.
But at the same time, working to pay my way through
college, I was doing marketing on the side. I went and got a
Master's degree with the same professor that I had worked
with in my undergrad in ecology, and that's really where,
doing research with him, I first got exposed to doing science,
the scientific method, gathering and analysing and
presenting data, which is really the core of data science. My
Master's thesis and Master's work required a ton of
programming. I was in an ecology and evolution programme,
and so a lot of those skills I had to pick up myself.
Kirill: Yeah, I can imagine. That's some tough stuff.
Beau: Yeah, so I was looking at erosion in the desert, the US
Southwest desert, and I was developing photogrammetry
methods to use images from small digital cameras, local
images, and calculate the amount of erosion that was
occurring at a small scale.
Kirill: Sorry, erosion is when the desert takes over green areas? Is
that what it is?
Beau: So that's when wind or water moves soil or sediment away.
That's erosion.
Kirill: Ok. So it becomes more desert?
Beau: Yeah. In the US Southwest, it's a huge problem because one,
you get these huge dust storms that blow through and lose
visibility and cause car accidents and things like that. And
then from an ecology standpoint, there's all these nutrients
for plants and stuff that get blown away. The dust from
these deserts actually lands on the Rocky Mountains and it
causes the snow to melt up to 30-40 days earlier than it
usually would. So, there’s all kinds of change in the
ecosystem because of this sort of thing, and people make the
dust problem worse by riding ATVs or grazing animals.
That was kind of the context of what the research was, but
the part that I really liked is I was developing these new
methods that no one had used before, you know, taking
images and creating 3D models from them, and then over
different time periods calculating the difference in volume
that had been lost along with a bunch of other data we were
collecting. That required a lot of programming in MATLAB, in
R, and in Python. And I think that’s where I really started to
develop my data science skills, when I was working with
large biological datasets and generating a lot of data.
And about that time, I got a job at a marketing consulting
firm as their data scientist. That’s when I first started taking
these scientific skills that I had learned and employing them
in marketing and advertising. And instead of analysing dust,
which my wife assures me is a very boring subject, I was
analysing social media data, web analytics and sales data
and stuff like that, helping develop predictive models and
really analyse the effectiveness of different campaigns. That’s
how I made the transition from my Master’s into data
science.
Kirill: And just before we continue, how did that feel? Which one
did you prefer more? How did they compare, you know,
using science in science or using science in business?
Beau: You know what? They both are really exciting to me. My plan
early on was to go get a PhD. I didn’t end up doing that, and
I think the reason was I love science, but especially in
ecology, it’s hard to feel like what you do has an immediate
impact. You know, in the U.S. there’s all kinds of legislation
and other stuff like that. It can take a really long time for
people to even listen to your research, and for a change to
actually happen. And I feel like on the business side—you
know, sometimes the results are immediate. So I think that,
and having grown up in an entrepreneurial marketing
context, I was drawn to that. But for me, I felt like I was
doing science in both cases, you know. One purpose of
science is to uncover and understand the underlying laws of
why things are the way they are. In marketing and business,
there are laws.
Kirill: Yeah, I totally understand. I can completely relate to that
concept of immediate results. It’s very rewarding to see your
work actually bring some sort of change very quickly.
Beau: Yeah.
Kirill: Okay. And then what happened after that? I’ve got a feeling
like we’re just getting started here on this. And just a quick
note for those listening, Beau gets asked all these questions
so many times that he even wrote an article “There And
Back Again: My Return to Science.” So you’re kind of
walking us through this article, right? Through the main
points, but in more detail?
Beau: Yeah, there’s a nice “Hobbit” reference for those Tolkien
fans. So, it was during the recession in the U.S. and I had
been planning after my Master’s degree, originally going to
get a PhD, but then maybe feeling I didn’t want to go into
academia but, you know, I knew that I’d loved inventing
things and loved business, so I started thinking about a
career that—now I feel kind of stupid that I didn’t
immediately grab onto data science, but I started thinking
about a career where I could combine my love of science and
my love of business. And looking back to my dad having got
a bunch of patents, I decided, talking with a bunch of patent
attorneys, that I wanted to go to law school specifically to
become a patent attorney.
Kirill: Interesting.
Beau: So I went and I got the wrong doctorate for data science, a
more professional degree – a JD or Juris Doctor. (Laughs)
Kirill: And you got it very quickly. It took you like 3 or 4 years,
right?
Beau: Yeah. In the U.S. it’s a professional degree program and it
only takes 3 years.
Kirill: That’s really impressive. So you have a Master’s in biology
and a doctorate in law.
Beau: Well, yeah. I think PhDs in the U.S. wouldn’t consider JD a
doctorate, but yeah, it does have ‘doctor’ in the name. But
I’m not Dr. Walker.
Kirill: (Laughs) Okay. And where did that take you?
Beau: So, fairly soon into law school, I started to realize how much
I missed science and how much I missed programming.
Kirill: Because you got none of that in the law school, right?
Beau: No. And, in fact, the way that law works is truth is all
relative. In science it’s a lot more, you know, you have the
data to support your conclusion or not, and the law is you
can convince the judge or not. Or the jury. So that was
always kind of uncomfortable for me. So I started to take on
some freelance clients, doing stuff on the side while I was in
law school. I hope my contracts professor isn’t listening to
me, but I was the kid that he’d look over and I was always
programming on my laptop instead of taking notes. (Laughs)
Kirill: And how did you find the clients, just out of curiosity, was it
some website online or somehow?
Beau: A combination of personal connections that I knew from the
work that I’d done before. Sometimes I’d get clients off of
freelance sites like Fiverr or Upwork.
Kirill: Yeah, I always recommend Upwork to people. It’s a very good
website for that sort of stuff.
Beau: Yeah. And then I had a couple of my own projects that I was
working on. A couple of months into law school, I started
working for a law firm, intellectual property law firm, and got
a ton of experience drafting patents, litigations, doing
trademarks, all of that. I worked mostly full-time, 20 to 40
hours a week all throughout law school there. And it was a
great experience because the firm where I was at had really
great training, and I got to do all the things that I would do
as an attorney and really good experience. You know, I
drafted a ton of patents, a lot in biotech and in software and
data science areas, but frequently we’d have inventors come
in and I was jealous that they were doing all these cool
things and I was just writing about it. So that was just
always in the back of my mind, like, “They’d invented this
really cool thing.” And sometimes patent attorneys play a
small part in helping make it a little bit better, but I’m
jealous that I’m just writing about the stuff that they’re
doing.
Kirill: But jealous in a good way, like it pushed you to change,
pushed you to realize things about yourself?
Beau: Yeah. I actually was contemplating, like, “Can I jump back
into data science full time?” I kind of tested the waters a
little bit. And it was hard while I was in law school, I was
committed to finish that out. But almost a year after I
graduated law school, I had my current boss reach out to
me. He found me on LinkedIn just out of the blue, you
know—and we can talk about LinkedIn, but that’s maybe a
whole other podcast. (Laughs) I’m a big fan of LinkedIn. He
said they were looking for a data scientist and he really liked
my background. And I got to talking to him and I was really
fascinated. He had a bioanalytics company, the clients were like
pharmaceutical companies and health care, and they had
their own form of machine learning that was evolutionary
programming-based. My Master’s degree is in evolution, so
that was really interesting to me.
So I decided to leave law and to join that company. That
kind of brings me to where I am now. I’ve done a lot of
freelancing for various companies of different sizes,
everything from marketing to sensor companies, and now
I’m the data scientist for a biotech/bioanalytics company, so
I’ve kind of gone full circle.
Kirill: That’s really cool – a bioanalytics data scientist. And it’s an
amazing story how you went there and back. You say that
on one hand you really missed data science, and I think that it
was probably a necessary step in your career to go away
from data science. We realize how much we miss things only
when they’re absent, like the saying, “Absence makes the
heart grow fonder.” And at the same time, I’m sure there’s a
couple of things that you probably picked up in this law
degree that you were doing that you now use in your career.
Could you mention something? For somebody who’s a
lawyer, or studying law, out there and listening to this
podcast, what is some skill or habit or something that you
picked up during law that you’re still applying in data
science?
Beau: Oh, absolutely. There’s a couple of things. One, lawyers are
trained to be very good at seeing and even being able to
argue different sides of the same issue. And, you know, a lot
of times when we’re analysing something in data science, it’s
not entirely clear immediately what the data are saying. And
sometimes you have to be open to “Maybe the algorithm or
method that I’m using is telling a misleading story and I
need to look at it from another angle.” So that’s one thing
law taught me.
The other thing, I think a huge part of a data scientist’s job
in many companies is kind of being the gap between what
the data are saying and what that actually means in terms
of what the business should do. And those communication
skills are something that being in the legal profession
definitely helped me with in terms of being able to
communicate complex subjects. So that’s another thing.
And the third thing, the area of patent law is really
interesting, especially if you’re drafting patents, because like
data science, it’s a profession where you’re always learning.
You’ll have an inventor come in and he may have invented a
new way of designing or using an oil rig. So then you have to
do a bunch of research on all the nuances of how oil rigs
work. And then your next client will have invented a new
database schema or something, so you have to become
enough of an expert, or if not an expert, conversant enough
that you could draft a patent in it. You know, just the ability
to quickly come up to speed on whatever the topic is, is
immensely useful in data science because the field is always
changing, always encountering new problems that maybe
aren’t exactly like what you’ve encountered before. So the
ability to know where to turn to find the information you
need is really important.
Kirill: Yeah, definitely. It’s a skill that you can’t just learn
overnight. It’s something that you have to practice, practice
and practice. Those are some solid skills that you took away
from your law degree, and I’m sure a lot of people will find
this useful. Okay, now you are passionate about inventing,
right? You have, what, 20, 30-odd patents and trade
secrets? Or is it more than that?
Beau: Yeah, I’ve always kind of had a bunch of ideas. And what’s
really exciting about my current role is that the company I’m
at places a huge value on intellectual property. So I kind of
have a dual role. I’m there as a data scientist primarily, but
I’m also heavily involved with our IP strategy as someone who is
familiar with that world, and I help manage our outside
counsel. Since I’ve been there, we’ve started filing a ton of
patents, I’ve invented a number of things. It’s just been
really fun. You know, everything from new ways of analysing
biological samples to new unsupervised learning clustering
methods and stuff like that. It’s just been really fun to have
that creative side.
Kirill: That’s really cool. And it sounds like they hired you as a
data scientist, but now you’re doing not just data science,
but also the patent side of things and helping out with that.
Can you tell us a bit about that, because that sounds like a
very interesting career move or career development stage
where you came into the company to do one thing—and
correct me if I’m wrong, maybe they hired you right away to
do both things. Can you tell us about how your role has
developed in these past seven months that you’ve been
there?
Beau: Yeah. We have the benefit that the company is smaller. You
know, if you’re in a larger company, you maybe don’t have
that flexibility. But one thing that was really attractive to me
about this role is that there is a lot of opportunity for me to
help shape the company. They had invented a number of
things before I joined the company and they’d filed, I think
one patent. And when I joined the company I said, “Hey, you
know, I’ve just spent the last three years around patent
attorneys. We should be approaching this differently. You
have a ton of value here and you bring incredible value to
the company. Let’s start filing some of these things.” So we
filed a bunch of patents since then, both on the old stuff and
new stuff that we’ve come up with. It’s been me not being
afraid to say, “Hey, I have experience in this and this would
be useful,” and then having the data to back up and say,
“This is why it would be useful,” it’s kind of the same skills
of advocacy that are useful.
Kirill: Awesome. And that sounds like a great example to everybody
listening that there is room always to leverage your existing
skills. You’re going in as a data scientist, but you have
interest in something else, you should express that interest
to your manager, to your boss, to other people and
proactively work towards making that happen, making your
role shift in that direction or expand in that direction.
And because you’re so passionate about it, you’re inevitably
going to be happy doing it and you’re going to be bringing
even more value to the company so people are always going
to be open or should at least always be open. And that’s for
the managers out there, to be open to suggestions like that
because it’s a win-win for everybody. That’s a great example
of that. Awesome!
Beau: Yeah, I think that’s good advice.
Kirill: Yeah. And thanks for sharing that in your story. You
mentioned a couple of tools that you used in your degree –
MATLAB, R, Python. What are you using predominantly now?
Beau: In my day-to-day, I primarily use R just because the data
manipulation and visualization tools are really great. The
bulk of our machine learning is done with our proprietary
software that we’ve coded that’s in combination with some
other languages. I do the bulk of my day-to-day in R with
some Python.
Kirill: Yeah, gotcha. And which one do you prefer, R or Python?
Beau: Probably R because I’m more familiar with it. I’ve used it a lot
more. But Python is great and it’s quickly in my mind
gaining all the benefits that R has.
Kirill: Gotcha. This evolutionary programming-based machine
learning is very interesting. I think we’ve had a guest before,
Deblina Bhattacharjee, she was in evolutionary-based
artificial intelligence. Without revealing any trade secrets or
patents, can you give us just a general overview of this topic
area? What is evolutionary programming-based machine
learning?
Beau: Yeah. I can give you a very general use case of how it works
with what we do.
Kirill: Yeah. That would be great.
Beau: There is this idea, a term called ‘inverse problems.’ An
inverse problem is one where you don’t necessarily know
what the problem is. You maybe just have data about it. You
may have a set of predictors or data about circumstances
and data about outcomes. And there’s some mathematical or
statistical model in the middle of those that relates the
predictors to the outcomes. That’s kind of generally the goal of
machine learning.
But specifically, in biology and in human health, there are
mathematical rules that govern disease and other things like
that, but we don’t necessarily know beforehand what those
relationships are. One way that we approach that problem is
through evolutionary programming. And the idea, and the
way that our software works, is that we start off randomly
generating millions of models, mathematical models or
algorithms that comprise math operators like addition,
subtraction, division, sine, cosine, constants and then n
variables, any of the variables in the dataset. We put them in
the digital ecosystem and then let them evolve.
At the beginning, the algorithms are very bad, they predict
the outcome very poorly. But you’ll have some that maybe
instead of getting a coin flip 50/50 chance of predicting the
outcome, they will be at 51%. So that algorithm will kill off
other algorithms and then they’re allowed to either mutate,
duplicate themselves or mate with other winning algorithms.
Kirill: (Laughs) This is so cool.
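(Editor's sketch) The loop Beau describes can be illustrated with a toy genetic-programming example in Python. This is not Liquid Biosciences' proprietary software: the tree representation, the population size, the share of survivors, the mutate/mate/duplicate probabilities, the `MAX_NODES` size cap and the toy dataset are all assumptions made for illustration. Only the overall shape comes from the conversation: random models built from math operators, constants and dataset variables, with the better predictors surviving and then mutating, duplicating or mating.

```python
import math
import random

# Operators a model may use: name -> (function, arity).
OPS = {
    "add": (lambda a, b: a + b, 2),
    "sub": (lambda a, b: a - b, 2),
    "div": (lambda a, b: a / b if abs(b) > 1e-9 else 1.0, 2),  # guarded division
    "sin": (math.sin, 1),
    "cos": (math.cos, 1),
}
MAX_NODES = 25  # assumed size cap so models stay small


def random_tree(rng, n_vars, depth=3):
    """Build a random expression: ("var", i), ("const", c) or (op, child, ...)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("var", rng.randrange(n_vars))
        return ("const", rng.uniform(-2.0, 2.0))
    name = rng.choice(sorted(OPS))
    return (name,) + tuple(random_tree(rng, n_vars, depth - 1) for _ in range(OPS[name][1]))


def evaluate(tree, row):
    """Compute the model's numeric output for one row of variables."""
    if tree[0] == "var":
        return row[tree[1]]
    if tree[0] == "const":
        return tree[1]
    return OPS[tree[0]][0](*(evaluate(child, row) for child in tree[1:]))


def accuracy(tree, rows, labels):
    """Fraction of rows where 'output > 0' matches the 0/1 label."""
    hits = 0
    for row, label in zip(rows, labels):
        try:
            out = evaluate(tree, row)
        except (ValueError, OverflowError):  # e.g. sin(inf)
            out = 0.0
        hits += (out > 0) == bool(label)
    return hits / len(rows)


def subtrees(tree, path=()):
    """Yield (path, node) for every node in the tree."""
    yield path, tree
    if tree[0] in OPS:
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))


def replace(tree, path, new):
    """Return a copy of tree with the node at path swapped for new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]


def mutate(rng, tree, n_vars):
    """Swap one random node for a fresh random subtree."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(rng, n_vars, depth=2))


def mate(rng, a, b):
    """Graft a random piece of model b into a random spot in model a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)


def evolve(rows, labels, pop_size=100, generations=15, seed=0):
    rng = random.Random(seed)
    n_vars = len(rows[0])
    population = [random_tree(rng, n_vars) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda t: accuracy(t, rows, labels), reverse=True)
        survivors = ranked[: pop_size // 4]  # the rest are "killed off"
        children = []
        while len(survivors) + len(children) < pop_size:
            parent = rng.choice(survivors)
            move = rng.random()
            if move < 0.4:
                child = mutate(rng, parent, n_vars)
            elif move < 0.8:
                child = mate(rng, parent, rng.choice(survivors))
            else:
                child = parent  # duplicate unchanged
            too_big = sum(1 for _ in subtrees(child)) > MAX_NODES
            children.append(parent if too_big else child)
        population = survivors + children
    return max(population, key=lambda t: accuracy(t, rows, labels))


# Toy data: 3 variables, outcome depends only on whether x0 > x1.
data_rng = random.Random(1)
rows = [[data_rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
labels = [int(r[0] > r[1]) for r in rows]
best = evolve(rows, labels)
print(best, accuracy(best, rows, labels))
```

Because the survivors are carried over unchanged each generation, the best score in the population never drops. The printed model is a small, human-readable expression tree, and scoring it on the same rows it evolved on says nothing about overfitting, which is the next topic in the conversation.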
Beau: Yeah. So you go through multiple generations, and there’s a
lot more specifics of how to get this to work, but what we
end up with is a predictive model that’s evolved to the
dataset to predict the outcome of the problem. There’s a
couple of questions that we always get. One is, “What about
overfitting?”
Kirill: Yeah. You were reading my mind. I was just sitting here
thinking that.
Beau: Yeah. That is a concern for any method, but evolutions can
be really good about overfitting. And the bulk of our IP, or if
not the bulk, a good portion of our IP deals with this issue,
so there’s a couple of ways that we deal with that. One is to
make sure that we have training, validation and test sets.
There’s a number of ways to deal with that, but what we end
up with is a model that is small, typically models that we
produce out of our process are between 5 to 15 steps and
they have maybe anywhere from 3 to 9 variables and a
couple of different math operators. And they’re very robust. They hold
up really well in terms of sensitivity, specificity or area under
the curve or whatever across different out-of-sample
datasets.
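(Editor's sketch) The evaluation ideas Beau mentions, holding out validation and test sets and then reporting sensitivity, specificity and area under the curve, can be written in a few lines of plain Python. The 60/20/20 split, the function names and the tiny example labels below are illustrative assumptions, not details from the episode.

```python
import random


def split_indices(n, train_pct=60, val_pct=20, seed=0):
    """Shuffle row indices and cut them into training, validation and test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def sensitivity_specificity(labels, preds):
    """Sensitivity = true positive rate, specificity = true negative rate."""
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, preds))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, preds))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, preds))
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve: chance a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# A 60-patient dataset, like the "N of 60" case discussed later in the episode.
train, val, test = split_indices(60)
print(len(train), len(val), len(test))                         # 36 12 12
print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))     # (0.5, 1.0)
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))                 # 0.75
```

In this kind of setup a model would be fitted on the training rows, compared against alternatives on the validation rows, and scored once on the untouched test rows, so the reported numbers reflect out-of-sample behaviour.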
The reason that this is a really powerful approach in health
care is that often in medicine, it’s not just enough to predict
an outcome. You need to know why and you need to know
the underlying mechanism. Using this approach, we can
take datasets that have millions of variables per patient and
bring that down to the 3 to 9 most important biomarkers or
whatever. It’s really powerful.
The other benefit is that a lot of deep learning/machine
learning techniques are very data-hungry, but in health care
and pharma, you often have datasets where you have an N
of 60. You may have a couple million biomarker variables
per patient, but it’s only 60 patients deep. Evolution can
deal with that, though. That’s kind of the general principle of
what we do as work. There’s a lot more detail in it, but as
someone who came – at least academically – from an
evolutionary background, I’ve always been intrigued by
evolutionary programming methods, and they were initially
very popular when they first came out, kind of like neural
nets were, but ran into a number of problems in terms of
implementation, you know, the hardware wasn’t ready, it
was very computationally intensive, and there were a
number of issues with implementation. What we say at our
company is, deep learning neural nets went through this
where the hardware finally caught up and there were a
number of key innovations in terms of implementation and
that’s why they’re performing so well today. I feel like the
same thing is happening for genetic
programming/evolutionary programming.
Kirill: That’s fantastic. Thank you for such a good overview. And I
really liked what you mentioned towards the end, that it’s
important in medicine often to know what is the reason for
certain outcomes and that your algorithms are therefore
interpretable, that you can figure a way that you can get
those variables out of it.
Beau: Human-readable, yeah.
Kirill: Yeah, and also that your algorithms in some ways beat deep
learning, especially in the sense that deep learning is data-
hungry, right? (Laughs) This is probably something that you
talk to Ben Taylor about sometimes.
Beau: Yeah. (Laughs) Actually on LinkedIn I tag him in some posts,
too. Deep learning is really great at some very specific
things, but there’s a lot of use cases where, just like with
any method, it’s not the right tool.
Kirill: It’s good that this alternative exists, right?
Beau: Yeah.
Kirill: I’m especially very happy to share this with our listeners
because sometimes all you hear is deep learning – especially
if you talk to pioneers in the deep learning field, you just get
that deep learning can beat everything. But what if you don’t
have enough data? What if you have, like you say, 60 data
points? Well, apparently there are other ways, such as
evolutionary computation or evolutionary programming,
machine learning and AI which is really, really cool.
Beau: Yeah. I mean, it’s tempting to look at data science as a
profession and feel like every situation where you use data is
like Google or Facebook. That hasn’t been my experience.
You know, I think they’re the most prominent examples, but
they’re sitting on amounts of data that most industries can’t
even dream of having. Their problems are different. Data
science is just as important in those other industries, even
with less or different types of data.
Kirill: I totally agree. It’s so exciting. What you’ve created seems
like a model of the real world, but probably on steroids,
meaning it evolves really quickly. But algorithms killing each
other, mating with each other and duplicating themselves?
That’s crazy. I can’t imagine how much fun you’re having at
work.
Beau: It’s really cool. And it’s cool to use my ecology and evolution
background because it’s been incredibly useful. And I’ve
kind of put that on the backburner in data science over the
past couple of years and coming back I’m like, “Nature has
been doing things for billions of years for a reason: It works.”
(Laughs)
Kirill: Yeah, I totally get it. And was it hard to recall all those skills
from your biological background? You know, because you
put it on the backburner for some time, was it hard to
reinstate that?
Beau: No, because I’m really passionate about that type of stuff.
And I think anyone who is connected with me on social
media gets sick of the biology-related stuff that I post or
“This cool article on evolution,” you know. They’re like,
“There goes Beau again.” Most people don’t care about that.
(Laughs)
Kirill: You kept up your passion, even while you were doing law
and other stuff.
Beau: Yeah.
Kirill: Yeah, that’s important. Okay, that’s really cool. Thanks a lot
again for sharing that. I wanted to go to your patent
background. I think we’ve never had a guest who specializes
in patents and trade secrets and it would be criminal of me
not to ask you some questions about something there. First
of all, what’s the difference between a patent and a trade
secret?
Beau: In the U.S., because that’s what I’m familiar with—well, I
have to also probably legally say I’m not an attorney.
(Laughs) I just worked at an IP firm, I have a JD.
Kirill: This is not legal advice, everybody. Please consult your
attorney.
Beau: Yeah. So, in the U.S., trade secret is something that—a
really good example would be Coca-Cola’s recipe for Coke,
something that they don’t want to get out public, but they
hold secret in the company. And to qualify as a trade secret
in the U.S., usually there’s a whole bunch of ways that you
have to deal with that. For example, you have to make sure
it’s really clear that it’s a trade secret, you have to have all
the protocols in place for limiting information and who has
access to it and stuff like that. A patent is kind of the
opposite. The way that you protect yourself is by telling
everyone about what you’re doing. Kind of the underlying
purpose of patent law is the government says, “You tell us
what you’re doing and what you’ve invented in the form of a
patent, and if we grant it to you and decide it’s indeed
something new that no one else has done before, we are
going to publish it so everyone can see but we’re going to
give you a monopoly for a certain amount of time where no
one else can do it and you can actually enforce that right if
someone copies you.”
So, with a trade secret, the only way that you can enforce it
is if someone has stolen it from you and then they go and
use it. But if someone independently comes up with the
same idea that you’re keeping as a trade secret, then you
can’t enforce that because they didn’t steal it from you. They
just came up with it.
Kirill: And moreover, they can go and patent it and then stop you
from using your trade secret.
Beau: Yeah, potentially. So, that’s kind of the difference between
both of those. And there’s business reasons for keeping
some things as trade secrets. A lot of times, things that are
kind of obvious but someone hasn’t thought of them yet, it
might make sense to keep it as a trade secret. Or, you know,
in the case of Coca-Cola, they don’t want their recipe public
because they might be able to have a patent on it for—well,
maybe not now because it’s gone on so long, but they might
be able to have a patent on it for 20 years, and then any
competitor could use their exact formula.
Kirill: Yeah, exactly. And that’s what you see when you go to a
pharmacy and you’re asked do you want the—I don’t
remember what the first word is, but they’re like, “Do you
want this type of drug or do you want the generic drug?”
Beau: Brand or generic, yeah.
Kirill: Brand or generic. So, the brand is the guys who patented it,
and then 20 years pass, and now everybody else is allowed
to make the generic version, which uses exactly the same
formula because they can use it according to the patent, it’s
just a different company. So technically, it’s the same thing.
Beau: Yeah. And the thought is, especially for pharmas, that
there’s a ton of risk in the R&D of creating drugs, so you
want to incentivize people to take that risk by giving them
that monopoly for a certain amount of time so that they
could be the ones that benefit from it.
Kirill: Yeah. And that’s also why you sometimes hear about these
drugs that cure certain diseases which are really hard to
cure—I can’t give an example because I just don’t know this
well enough—and one pill costs like $50,000. Even though it
only costs the company $50 or $100 to create that pill, to
put it together, they sell it for $50,000 because of the
amount of time and money they put into research and
development in the previous years. Now they have the patent
for 20 years, so they have to get that money back. Otherwise
nobody is ever going to be creating these drugs in the first
place.
Beau: Yeah. That’s the idea.
Kirill: Okay. That’s really cool. So let’s say I’m a data scientist, I’m
a freelancer and I’ve come up with this really cool new way of
doing machine learning, something that I don’t think
anybody has ever done before, and I want to patent it or I
want to create a trade secret. Is there any chance that I can
do that or do I have to be an organization?
Beau: You can definitely do it as an individual. I mentioned earlier
in the podcast that my dad has a number of patents. They’re
not in the data science space at all, but you just pay for it,
you find a patent attorney or you can even file it yourself.
That’s entirely possible. It’s extremely difficult – I would
highly recommend someone hire a patent agent or an
attorney just because I feel like governments intentionally
make the process so complex. But yeah, you can go out and
do it.
One thing I will say is, in the U.S. especially, it’s gotten a lot
harder to patent anything related to software in the past
couple of years. There was a case a couple of years ago
called “Alice v. CLS Bank.” For a proper interpretation you
should talk to a patent attorney, but it made it a lot harder
to get patents in that space. Not impossible, you definitely
can still do it. For example, if your invention improves the
functioning of the computer, it makes it faster or something,
those things can help. So I’d recommend, if you have
something new that no one else is doing, talk to a patent
attorney and figure out if you can protect it.
Kirill: Cool. Okay, so there are chances and people can somehow
protect themselves?
Beau: Yeah.
Kirill: Okay, fantastic. Thanks a lot for sharing that on the patent
and trade secret side. And I have one more question that we
didn’t discuss, which sounds really interesting. Your boss
found your LinkedIn and you said you had some good things
to say about LinkedIn.
Beau: (Laughs) Yeah. So, this goes to the undercurrent of my whole
career. We’ve just discussed my winding path. I’m a son of a
marketer. You know, I’ve been doing digital marketing in
some sense for the past 15 or more years.
Kirill: Yeah. You even have a website for that, right?
Beau: Yeah. I don’t know how many people I want to refer to that
because I’ve had some issues with the host. Now, I might say
it hasn’t been performing well, but yeah, LinkedIn—I’ve
gotten so many opportunities from LinkedIn, and especially
over the past couple of months. I decided to finally start
practicing what I preach about LinkedIn, and be a lot better
about posting and engagement. I post a couple of times a
week and my average post gets about 50,000 views, along with
hundreds of likes and comments. They’re usually all on data
science topics, and it’s been really cool. I post about things that
are interesting to me, but mainly not sharing my opinion,
mainly wanting to ask what others’ opinion is. And there
have been some really great discussions on things that I
posted, really smart people sharing their opinion. You know,
I’ve gotten clients from LinkedIn that have just found me out
of the blue. My boss was just searching for data scientists in
Orange County and came across my profile because I’d done
some things to optimize it so I showed up in search results.
Yeah, I think it’s essential to pay attention to your digital
footprint in this time.
Kirill: For sure. Would you recommend LinkedIn to people who are
looking for jobs?
Beau: Absolutely. I think if you’re looking for a job, it’s absolutely
important. I think even if you’re not looking for a job, you’re
happy in your career, I think it’s incredibly valuable. One
good example is there’s a data scientist at Facebook named
Brandon Rohrer, you’ve probably heard of him. We actually
went to the same undergrad, but he’s very active on
LinkedIn even though he has no intention of ever leaving
Facebook. He’s clear about that. So he’s not looking for a
job, but he’s very active on LinkedIn, posts things that are
incredibly useful. I think it’s really good to be involved in the
global data science community even if you’re not actively
looking for a new job.
Kirill: Yeah, for sure. You can make some great connections, as we
said at the start, and you can just give back to people and
share your progress. It’s a great community to be in. People
do things for each other and they help and we grow together,
so why not, right?
Beau: Yeah, absolutely.
Kirill: Okay, thanks a lot. I just have some rapid-fire questions to
wrap this up. Are you ready for this?
Beau: Yes.
Kirill: Okay. What’s the biggest challenge you’ve ever had as a data
scientist?
Beau: In my current role, we have core services that we offer and
then every once in a while we’ll have a specific request for a
client. I mentioned that we’ve done some work in developing
new unsupervised learning methods to solve a specific
problem, and I think that’s been the most challenging thing
that I’ve done. I will never pretend to be a mathematician.
I’m a biologist and scientist first, you know. Math is an
incredibly useful tool, but I invented this new algorithm, this
new approach and it worked really well for this specific
purpose. It was really challenging but very rewarding at the
same time.
Kirill: Okay, cool. So that was the biggest challenge, gotcha. I can
imagine inventing something brand-new from scratch.
That’d be crazy hard. This might be related to this previous
one, but what’s a recent win that you can share with us that
you’ve had in your role that’s something that you’re proud
of?
Beau: I recently redid the deliverable of our results that we prepare
for our clients. And I’m kind of a data visualization
nut, and I’m really proud of how that looks. However good
and accurate your analysis is, if you don’t have a
way to present it to whoever is consuming it in a way that
they understand, especially if they’re not data scientists,
then your analysis doesn’t really mean anything. And I feel
like we’ve come up with a really good, well-designed
deliverable that conveys the complexity of what we do in a
simple way that non-data scientists can consume and use
and understand. So I’m really proud of that.
Kirill: That’s really cool. I can imagine. It’s very interesting that you
put the focus on visualization, because I completely agree. I
think you’ve already touched on the importance of
communication at the start of this podcast, and yeah, it’s
totally true, especially in something like what you’re doing
which is so ground-breaking and different to what everybody
else is doing. You need to get people ready. You know, with
your method it almost sounds like at the very start of your
presentation, you should show them a quick animated
cartoon from Darwinism or something, biology, how natural
selection works, and then people will be on board with your
whole evolutionary programming-based methods.
Beau: Yeah, absolutely.
Kirill: Alright, cool. And what is your one favourite thing
about being a data scientist?
Beau: I think it’s the scientist part. You know, I love doing science.
I love discovering new things and just going through that
whole process. It has so many applications, especially
powerful in business, and I think that’s a part that really
drew me back into data science full-time, is finding solutions
to questions and to problems.
Kirill: Okay, awesome. I can totally relate to that. And it really
resonated with me when you compared the differences
between finding the absolute truth in science and finding the
relative truth that will help you convince the judge and the
jury in law. A good contrast.
Beau: I’ve always been uncomfortable with that aspect of the legal
profession. (Laughs)
Kirill: Right. And no offence to the legal professionals out there, it’s
just that everybody has their own preferences, I guess, what
they like or don’t like. An interesting question: Where do you
think the field of data science is going? Like, from what you
know, from what you’re doing, from what you’ve seen in your
many lives and careers, what should our listeners prepare
for to be ready for the future that’s coming?
Beau: I think it’s already happening. Much, or a lot, of what data
science does is being automated. If it lessens the amount of
time that I have to spend cleaning and pre-processing data,
then great. (Laughs) So, I think there’s just going to continue
to be more automation. But I think there’s kind of a double-
edged sword with that. We need to know the reasons why
the AI that we’re using or whatever is making the decisions
that it does. And I think that the core value a data scientist
has is in producing value from data. And that requires that
communication, that visualization, the making sense of what
the algorithm spits out.
I think focusing on that is really the most future-proof kind
of thing for a data scientist. Is what I’m doing actually
providing value? Instead of just being a research exercise,
am I actually contributing to driving revenue for my
company or my clients? That’s kind of what I feel like will
continue to be important. The challenge now is not
accessing or creating data or recording data. It’s so easy to
do in the time that we live in, but making sense of it is not
going to go away. It’s something that’s really important.
Kirill: Fantastic. I totally agree with that. Yeah, very, very powerful
point of view that visualization and presentation and that
communication, being the communicator between the
insights, no matter how they’re gathered, whether
automatically or non-automatically, by hand, and the people
that it needs to be communicated to. That is definitely
something that is going to stick around for a long time.
Thanks a lot, Beau, for sharing, for coming on the podcast.
Beau: Absolutely.
Kirill: How can our listeners contact you, find you or follow you? It
sounds like LinkedIn might be one of the best options.
Beau: Yeah, LinkedIn is a great way. You can find me on LinkedIn.
I think Kirill will probably have my contact info. LinkedIn is
great. Also, through e-mail is good. I can give my e-mail, it’s
[email protected]. Either of those two ways is great. I’m
responsive on either. But don’t spam me, though. (Laughs)
Kirill: (Laughs) Fantastic, yeah. Definitely, guys, connect with
Beau and reach out. I’m sure there’s going to be more follow-
up questions to your story. And I have one final question for
you: What is your one favourite book that you would like to
recommend to our listeners that can help them become
better data scientists?
Beau: This is a book that’s a classic and came out way before the
term ‘data scientist’ did. It’s Edward Tufte’s “The Visual
Display of Quantitative Information.” I think going back to
the whole idea of how your data are interpreted is the most
important. I think any data scientist would benefit from
reading that book. The principles are just as important today
as they were when he originally wrote the book.
Kirill: Fantastic. Thank you for that. So, Edward Tufte: “The Visual
Display of Quantitative Information.” Check it out, guys, if
you want to be more like Beau. (Laughs) Okay. All right,
thanks a lot, Beau, for coming on the show, once again, and
sharing all of this. It’s been crazy and great.
Beau: Thank you, Kirill.
Kirill: Take care. So there you have it. That was Beau Walker and I
hope you enjoyed today’s episode. For me personally, the
most exciting part was of course the description of the
evolutionary programming-based machine learning. It’s a
very different space of data science and I really appreciated
that Beau actually shared the advantages that it has over
some of the existing approaches such as deep learning and
specifically the interpretability and also the fact that it
doesn’t require that much data in order for these models to
be run, which can be useful in some sorts of business
applications.
So I hope you learned something new today and you might
consider these things for your personal career. You can find
the show notes at www.superdatascience.com/93. And there
you can also find the link to Beau’s LinkedIn, so make sure
to connect and hit him up. Of course, as we mentioned at
the very start of the podcast, connections are so important,
especially in this day and age. Even if you just connect with
people on LinkedIn, that could lead to unforeseen
opportunities in the future. You can also find the
transcript at the same URL. And on that note, we’re
going to wrap up today. If you enjoyed today’s episode, we
have a quick favour to ask. Just head on over to iTunes and
leave us a rating or review. This will really help us spread
the word about data science and get even more people
enthusiastic about it. Thanks a lot for that. And I look
forward to seeing you next time. Until then, happy
analyzing.