SDS PODCAST EPISODE 321: THE LIFE OF ONE ADVANCED DATA SCIENTIST

Kirill Eremenko: This is episode number 321 with advanced data

scientist, Morgan Mendis.

Kirill Eremenko: Welcome to the SuperDataScience podcast. My name

is Kirill Eremenko, Data Science Coach and Lifestyle

Entrepreneur. And each week we bring you inspiring

people and ideas to help you build your successful

career in data science. Thanks for being here today

and now let's make the complex simple.

Kirill Eremenko: This episode is brought to you by DataScienceGO

2020, our very own data science conference. We've

already done three events in the past three years and

we're moving into our fourth year in 2020. And to give

you a feel for what to expect, here are some stats from

DSGO 2019. We had 620 attendees fly in from 25

different countries. 38 speakers gave talks, 150 plus

business decision makers attended the sessions as

well and get this, 2400 cups of coffee were drunk

during the networking sessions.

Kirill Eremenko: So DataScienceGO is not just a place where you will

get all the top data science skills that you need for

your career. That's definitely a huge component of the

conference, but also it's a great place where the

community comes together to network. At

DataScienceGO, you will meet data scientists and

professionals from companies like Accenture, AIG,

Wells Fargo, MasterCard, Facebook, Google, IBM,

Microsoft, Salesforce, Teradata, Amazon, eBay,

Shopify, and many, many more.

Kirill Eremenko: So this is a great opportunity to meet and network

with your colleagues, to meet and start catching up

with your mentor or maybe to even meet the manager

at the next company that you'll be working for. At

DataScienceGO 2020, we've been almost doubling

every single year. So we're expecting about a thousand

attendees at this next event.

Kirill Eremenko: DataScienceGO is happening on the weekend of the

6th, 7th and 8th of November, 2020 and you can

already secure your tickets today at

datasciencego.com. And one more thing is that we

actually have different tracks. So we found that this is

a very important component for attendees and we have

tracks tailored to your experience. So if you're a

beginner, there's a beginner track which will help you

get the skills to break into data science. If you're an

intermediate practitioner, there's an intermediate track

for you to progress to advanced. And if you're already

advanced, there's an exclusive advanced track just for

you.

Kirill Eremenko: So whatever your level, you can find the right track,

the right talks, the right workshops, the right sessions

and case studies and panels at DataScienceGO. So on

that note, this is the best conference for you to attend

to skyrocket your data science career. So make sure to

secure your ticket at datasciencego.com today. And I

can't wait to meet you in person in California in

November, 2020.

Kirill Eremenko: Welcome back to the SuperDataScience podcast ladies

and gentlemen, super pumped to have you back here

on the show. Today's guest, Morgan Mendis, is one

of the most advanced data scientists I have ever had

the privilege of meeting in person. Morgan and I met at

DataScienceGO 2019 a couple of months ago. And

since then his life has taken on so many interesting

twists and turns. You will be so excited to hear what's

been going on in his life. In this podcast you will learn

the story of how we met and why we got to chatting at

DataScienceGO in the first place. You'll also hear

about a VP of data science, a vice president of data

science role that Morgan is helping fill in Washington

DC. So if you're an advanced data scientist somewhere

in that area or if you're looking to relocate to that area, this is

going to be super exciting for you. Roles like that don't

just lie around on the ground. They're quite hard to

come by and this is an opportunity of a lifetime.

Kirill Eremenko: So listen up, it's going to be really, really exciting. At

the very start of this episode, you'll hear about that

role. And even if you're not looking to get into a VP of

data science position, maybe it's a bit too early for

you, maybe it's something you're aspiring towards

in the future, it will be very interesting to hear what

kind of requirements there are for a role like that and

what is the goal of a role of a vice president of data

science. You will also hear about why Morgan decided

to turn down a very exciting opportunity in his career

in order to follow his dreams, pursue his dreams

and passions and move to Haiti and what he's doing

there, the very noble and admirable cause that he's

helping with his data science skills called Ayiti

Analytics. You'll hear all about that and how you can

get involved if you're also excited about helping others

learn data science. So a very, very wonderful thing that

Morgan and the team at Ayiti Analytics are doing and I

was very inspired to share this story.

Kirill Eremenko: And of course we also went through Morgan's

background and all of the great takeaways that he's

learned along his way to becoming an advanced data

scientist. So you'll learn about Excel and how for some

applications he still uses Excel and why it's important

to know which tool to use where, how you can

automate Excel with R and you'll get some very

valuable tips there, especially on how you can save

time to apply to do more exciting things in data

science. You'll learn about how Morgan mastered Python

and why and when he uses R, when he uses Python,

when he uses both. How he combined his data science

skills with his econometric skills and what that led to.

You'll also learn a lot about the ETL process in data

science, how to maintain models, why it's important.

And also Morgan went into quite a lot of depth on the

Airflow tool, a very cool tool for extract, transform, load

procedures which you can already use in your career.

Kirill Eremenko: So if you've never heard of Airflow before, this is a

great opportunity for you to get up to speed with this

tool and see if it's right for you. Those are just some

examples of what you will hear on this podcast. I'm

very, very pumped about the conversation that we just

had and so let's not put it off. Without any further ado,

I bring to you advanced data scientist Morgan Mendis.

Kirill Eremenko: Welcome back to the SuperDataScience podcast ladies

and gentlemen, super excited to have you back here on

the show because today we have a very interesting

guest joining us. Calling in from Haiti, Morgan Mendis.

How are we going, Morgan? How's everything going for

you there?

Morgan Mendis: Good. How's everything with you Kirill?

Kirill Eremenko: Amazing. Everything is good and so cool that you're

calling in from Haiti like how do you pronounce it

again? You just told me I already forgot. How do you

pronounce the name of the Island?

Morgan Mendis: Oh, so it's Haiti in English, but people might know it

also in Creole as Ayiti.

Kirill Eremenko: Mm-hmm (affirmative). Ayiti.

Morgan Mendis: So it means land of many mountains.

Kirill Eremenko: The land of many what?

Morgan Mendis: Mountains.

Kirill Eremenko: Mountains. Okay. Well I've never been to Haiti. I would

love to go one day. And you told me your origin... Your

mother is from Haiti, right?

Morgan Mendis: Yes, my mother is Haitian and my father is from

Malaysia.

Kirill Eremenko: Okay, fantastic. Very cool. And it's very interesting.

You've only gone back there like, would you say 12

days ago?

Morgan Mendis: Yeah, so I just moved here about like 10 days ago to

start a new job down here.

Kirill Eremenko: That's crazy. Like so much has been going on in your

life. So let's rewind back a bit. So we met at

DataScienceGO 2019 in San Diego. That was what,

one and a half months ago. Right? And since then so

much has happened. So first things first, how did you

get to DataScienceGO? What were you doing there?

Because I thought you actually lived in

California.

Morgan Mendis: No. So what originally happened was about in 2018 I

saw on LinkedIn a bunch of people in my network in

data science had posted about DataScienceGO being a

really awesome conference and that they were actually

able to connect with other data science practitioners

rather than with like industry and more corporate

sponsors. So I remember during that week I went

ahead and got the early bird special and was like,

"Yeah, I want to go to this conference in California."

And coincidentally through my work previously, I also

got accepted to be a presenter at another conference in

the San Francisco area. So I went out there and then I

told my employers at the time, I was like, "Hey, I'm

already going to be out here in California. Would you

guys mind supporting me going down to San Diego

while I'm out there to go check out this data science

conference I've already paid for?" And they said, "Yeah,

sure, go down there, check it out. And also if you can

make a plug for our new opportunity for head of data

science at the team." So that's how I ended up at

DataScienceGO.

Kirill Eremenko: Oh yeah. Yeah, okay. And then that's when we talked

and you mentioned that you're hiring for a VP of data

science. Did you manage to find anybody at

DataScienceGO?

Morgan Mendis: No. That was actually the reason I remember I came

up to you at the last day of the conference because I

was like, "Man, I've met a lot of interesting people, a lot

of students." And a lot of them had questions for me

and I was really excited to give back, but I was also a

little frustrated because I was looking for somebody at

a higher level who could help support the team,

especially because I was interested in taking this

opportunity in Haiti.

Kirill Eremenko: Yeah.

Morgan Mendis: So that's when you invited me on the show. So I could

hopefully let it be known that they're doing really cool

and exciting things at Inspire and to meet somebody

that would be awesome.

Kirill Eremenko: Yeah, absolutely. And I wanted to say that I really

appreciated your feedback there and we actually very

actively took that on board and considered it. And I was

like, "Okay, why couldn't Morgan find a VP of data

science at the event? How can we fix whatever that is

indicating?" And we've actually worked a lot on the

event and at the next events at DataScienceGO

starting from like the next immediate one, we're

actually going to have a separate track for very

advanced practitioners like yourself. Like we already

had some advanced talks this time and I think you

mentioned you enjoyed like the Salesforce talk by the

head of data science at Salesforce, is that right?

Morgan Mendis: Yes, yes. She was talking about model comparison and

kind of the infrastructure and it was really great

hearing her perspective and the future of Salesforce

and data science there.

Kirill Eremenko: Yeah, so we already have a few talks like that, but next

year we're actually going to have a separate dedicated

track for advanced practitioners only. It will be very

exclusive limited seating to get as many talented

advanced practitioners like yourself in the conference

so that you can network with each other and

also that you can give back to the rest of the data

science community. So we definitely took that feedback on

board. Thank you so much for providing that and

coming up to me and indeed for our listeners out

there, one of the reasons why I'm excited to have

Morgan on the podcast today is because this is so rare

where a company is hiring for a VP of data science.

Kirill Eremenko: So if you're an advanced practitioner or you've been in

data science for three, four, five, six years or so or

more and you really believe you can lead not just

several data scientists into their professional career

growth, but actually lead an organization in the space

of data science. Morgan's got a great opportunity for

you. This is actually... So as I understand, you were in

this position yourself and then you decided to leave to

go to Haiti to pursue your passion and dream in this

other company and now this position has freed up and

you're looking to help your previous company fill this

role. Is that right?

Morgan Mendis: Not, not exactly. So the opportunity arose early, right

about the same time I went to DataScienceGO, my

previous manager had left and they were interested in

me stepping up and taking on more responsibility. But

at the time I was like, I'm really passionate about

pursuing my dreams and that means I have to go to

Haiti. So I wanted to support them as much as

possible in finding this new person to fill in the role.

And I thought that it was really important that the

person have data science experience from the get go.

Not potentially somebody who's making a

horizontal move from software engineering or like a

director of business analytics, but more focused

specifically on data science that has a vision, a

strategy for the tools, but also how we can incorporate

design and human elements into, like, healthcare

data science.

Kirill Eremenko: Well, on that note, why don't you tell us a bit about

your company that you've just left and where this

opportunity exists and if anybody's interested, how

they can get in touch about this role?

Morgan Mendis: Sure. So Inspire is one of the largest online networks

for patients and caregivers to share information about

medical conditions. So Inspire is connecting people so

they can share information to help them really

understand the condition, but also find emotional

support through others. So through the position of

data science, we're trying to find different ways that we

can surface relevant content and information to people

so that they can find others who are like themselves

and who are going through these really complex

medical conditions and potentially aren't able to get

that information directly from the doctors. So a huge

aspect of the work I was doing was trying to synthesize

information and all of this health data that they might

be getting in terms of medical jargon and translate

them to information that they can use to better plan

and understand their condition to really honestly

choose to live their lives in the best way that they can

based on their own values. So not the values that

potentially academia is prescribing them, but based on

giving them access to information so that they can

make their own decisions.

Kirill Eremenko: Okay. Wow, that's very noble. Would you be able to

provide an example of how you're using the data to

help patients who are fighting a specific condition,

something like just to put it into like a tangible output

somebody would get?

Morgan Mendis: Okay, sure. So we have a product at Inspire called...

So we're developing, it's called Health Profiles.

Kirill Eremenko: Mm-hmm (affirmative).

Morgan Mendis: Which actually allows you to, you answer a few

questions about yourself and then you're able to see

relatively in the community for example, what drugs

are you taking for your lung cancer. You will answer

these questions and then you could see other people in

the community who are taking similar drugs and

based on the privacy settings, because health

information is, security and privacy are really

important to us at Inspire. I'm saying Inspire as we,

but they're really important. So you want to protect

the patient and give them full control over their

information. But we still want them to be able to find

other people who they might be able to relate with and

connect with on a personal level.

Morgan Mendis: So we want to use the data to help them connect. So

they're able to answer these questions and they're able

to find other people who've answered similarly and

then reach out to those people if they've said, "Hey,

you can share my information with others in the

community." So that they can then ask them follow up

questions about, "Hey, I've been getting this side effect

from taking this drug. Have you experienced this as

well?" Right. Or they might say, "Hey, I've had a really

tough time adhering to my medication treatment

schedule, what are some tips that you do in order to

stay on top of your medication schedule?"

Kirill Eremenko: Okay. Okay. Got you. That's a very, I would imagine

helpful service to people out there who are going

through these tough challenges. So hats off to you and

to Inspire for doing this. This is a very cool

undertaking. And tell us a bit about the role. So this

VP of data science, how big would the team be?

Morgan Mendis: So the team is expected to grow. When I left there were

about five members on the team and from the

indication of the CEO, they really expect that data

science is going to become the bread and butter of the

company. So I wouldn't be surprised if the data

science team were to grow to being something like 15

plus. Again, I can't speak too much about it because

I'm not currently at the organization. However, they

see the strategic importance of data science and they

really want to find new ways of leveraging the data

that they currently have and potentially looking to use

existing open data to augment the data that they have

as well. So for example, there's a Blue Button that CMS

or the Centers for Medicare & Medicaid Services as

well as Veterans Affairs have opened up via FHIR

protocols.

Morgan Mendis: So this is also the same protocol that Apple HealthKit

is using. So currently if you're, let's say you're a

veteran, you might be able to go in and get all of your

medical information from Veterans Affairs, right? But

this data's coming out to you as a text blob. How do

you visualize? How do you use that information?

That's a challenge currently for data scientists in the

healthcare space, is that, we need to take this

information and we need to be able to analyze and

present it back to people in a way that then they can

make their own decisions about their health and how

they pursue medical services. So it's a huge political

debate in the U.S. right now about our medical

system and how we pay for medical services. But one

thing that's not debated by anybody is that patients

should be in control of their medical information and

they should be in control of their medical care.

Kirill Eremenko: Mm-hmm (affirmative). Yeah, totally. Totally got you.

And what kind of a person is the company looking for

in this position? For VP of data science. What kind of

experience?

Morgan Mendis: So they're definitely looking for somebody who's got

experience getting their hands dirty, but they're also

looking for somebody who has experience strategically

envisioning how to lead a team and how to take a

product from prototype all the way to production and

deployment. So they definitely want somebody who

has the engineering experience and the analytical

experience a little bit, kind of that business acumen,

but is also willing to get a little bit deeper into the

weeds of the technical depth of evaluating models,

evaluating technologies. So of course, right? There's

the one thing about the idea of it being a unicorn, but I

think a key thing that we also want to push in this role

would be the idea of understanding the importance of

the patient being at the center of it.

Morgan Mendis: So definitely have to have a little bit of engineering and

the product mindset, but I think that a really big

important thing to Inspire is the culture. So making

sure that somebody is able to understand the

importance of patient centricity and autonomy. We

don't want somebody who's pushing like, "Hey, this

data science, it's going to solve everything. They just

need to give us their data and then we'll manage

everything." No, it's got to be a give and take

relationship. So if they give us data, we need to be able

to give them back something so that they have an

incentive in order to share their information. Because

Inspire is, its mission and goal is actually to promote

medical science. So it's trying to work together. It has

relationships with the FDA and other research bodies

trying to understand how can they better understand

these conditions from the patient perspective. And so

at Inspire, they call that the patient voice. So having

someone in a role who understands the technologies,

but also understands the human element, I think

that's really important for Inspire.

Kirill Eremenko: Mm-hmm (affirmative). Mm-hmm (affirmative). And

where is Inspire located? Where would this role be

based?

Morgan Mendis: So this would be based in the D.C. area. Inspire's

located in Arlington, Virginia.

Kirill Eremenko: Okay.

Morgan Mendis: Right. It's going to be right next to the new Amazon

headquarters.

Kirill Eremenko: Nice. D.C., you mean Washington, D.C., right?

Morgan Mendis: Washington, D.C. Sorry.

Kirill Eremenko: Okay. Got you. Awesome. Well, and finally, if

somebody is interested, how do they get in touch? I'm

assuming you would make the referral. How do they

get in touch with you?

Morgan Mendis: So they can reach me obviously on LinkedIn. So

Morgan Andrew Mendis, M-E-N-D-I-S. That's my last

name. But yeah, please feel free to send me a message

on LinkedIn. I'm happy to share more information.

Kirill Eremenko: Got you.

Morgan Mendis: But if you also want to look up the Inspire website, it's

www.inspire.com.

Kirill Eremenko: Nice. Very cool. And we will of course show those

things in the show notes. So if anybody out there is

interested in a VP of data science position, which we

don't talk that often about on the podcast, there's your

opportunity. And also I think this was very cool:

even if you're not interested, I think it shows an

example of what people are looking for in a VP of data

science role and what those roles are like. So yes,

thanks a lot Morgan for sharing that. Hopefully you'll

get some really cool applicants for this role. And it

sounds like the company is doing some very good

things for the community.

Kirill Eremenko: And so tell us a bit about your dream now,

congratulations first of all, you're pursuing your

dreams. You're in Haiti, you've completely changed the

course of your career now and your life, sounds really

exciting. Tell us like, how did it all transpire?

Morgan Mendis: I was just a little kid, back in Maryland and I was

dreaming of, how can I eventually grow up to give back

to the world? And I was telling you earlier that it all

stemmed from the idea that I want there to be more

economic opportunity in the world, economic

development. And it kind of stems from, in order to

have economic opportunity, you need to have some

kind of education. You need to have some kind of skill.

But before you could pursue education, you need to be

able to pursue, have good health. So that's why I really

got into healthcare.

Morgan Mendis: My goal was to get into global health. However, the

positions are extremely competitive. And that's why I

was brought on at an insurance company and I was

working part-time during my final semester of college.

So I started learning R and that's kind of how I got into

data science. And that kind of led me down the road to

eventually finding myself in a position where here in

Haiti they needed somebody who was able to do data

analysis and data visualization and really string

together all of their information systems. And that's

how I ended up here in Haiti.

Kirill Eremenko: And you're now the principal data scientist at, what's

the company called?

Morgan Mendis: So yeah, I'm actually doing two things down here in

Haiti. One is that I'm working with a nonprofit

organization called the Caris Foundation. And there I

took the title as health informatics consultant.

However, I'm also working to start the first data

science lab here in Haiti called Ayiti Analytics. And

there we're trying to train the first generation of data

scientists in Haiti. So we're really excited to hopefully

push forward the opportunities of Haitians to pursue

analytics and to use data science to improve the state

of the nation.

Kirill Eremenko: Wow. That's another very noble cause. And yeah, what

kind of challenges, you only started this like a few

weeks ago. What kind of challenges are you expecting

along the way?

Morgan Mendis: So I was mentioning to you earlier that there's a lot of

current political instability in Haiti. So it's tough to get

people to come into the office every day. Internet

access might be intermittent here. Sometimes people

lose power. So it's really difficult to have some kind of

cadence in terms of scheduling. So we understand that

for people sometimes to get to a location where we're

having the onsite training, that could be a challenge,

getting around the city, there's often roadblocks. So

it's common here for you to be in many WhatsApp

groups in order to get information on which streets are

available. So that's one challenge, is transportation

for people coming in.

Morgan Mendis: Another challenge obviously is that for people to take

the time off, to spend, to learn data science, to invest

that time, they need to potentially not be working or

they might need to be working twice as hard as other

professionals in other countries where they might be

able to have a safe location where they can have

internet access, to learn the code. A lot of data science

learning requires online content and if you can't get

people physically in your proximity to help mentor

you, it's difficult. You need to be able to reach them via

the internet and that's a challenge here. So we are

experiencing that challenge, another challenge is that

there's not a lot of resources in Haitian Creole and in

French for data science. So I was very excited that you

were telling me that SuperDataScience has some of their

courses in French because that's definitely something

we're going to be interested in.

Kirill Eremenko: For sure. And as I mentioned, I would be more than

happy to support a mission like that and provide free

SDS accounts to your community to make sure that

they're learning and like we can do as much. And I

always love when people are doing things like this and

this was like what was surprising me when you were

speaking just now that you effectively turned down a

vice president of data science opportunity in an up and

growing company which is making a massive impact in

the world. It's something that people would just love to

have, a lot of people are aspiring to have and build

their career towards the VP of data science position.

You effectively turned that down to move to a country

where you're facing lots and lots of challenges and at

the same time like by your voice, I'm sure our listeners

will agree, I can tell you are happy, you're like excited,

you're living your dream.

Kirill Eremenko: How does that add up? Like you turned down a

massive progression in your career in order to follow

your dreams in a completely different environment,

which seems like much less secure, much less safe,

and yet you're very happy. Tell us, how do you feel

about all this?

Morgan Mendis: So one thing I have to say is that Haiti is one of the

most beautiful countries in the world. And I know

Kirill that you were talking earlier about that we both

are avid travelers. I'm just blown away being here. I

love the culture, I love the food, I love the music. It's

such a beautiful place. And I think that when you talk

about the opportunities of like people in their career

progressions, some people, they want to be in a

management position, they want to be in a position

where they're able to make big, high-level

strategic decisions. I have always been on more of the

community organizing side of things where I want to

be down with the people. I love learning languages

because I like talking to people and I like learning

about their problems. And generally what I've noticed

is that providing simple solutions that aren't

complicated, that can change somebody's life to me, is

much more rewarding than building these really

complicated tools and models that kind of sit behind,

sit in a server somewhere that nobody ever sees and

potentially doesn't change their life.

Morgan Mendis: It might make somebody a few more percentage points

of ROI on their investment vehicle, or it might save a

couple dollars for a supply chain. However, for me it's

about changing somebody's life. It's about talking to

people, it's about hearing about how can I make it so

they can live a more rewarding life. Because, data

science has given me the opportunity to live a

rewarding life. Our education as a society, our

development as a civilization has all been towards

pushing the whole race forward. It's not about us

individually or us in tribes, it's about us coming

together. So I'm really excited to be here and yeah, I

think that's about different interests, different

passions. You have to choose what do you find

valuable in life?

Kirill Eremenko: Mm-hmm (affirmative). Mm-hmm (affirmative). Yeah,

no, that's absolutely true. Your fulfillment, happiness,

they don't really come from accomplishment or going

up the career ladder. And sometimes it's necessary,

sometimes you might find it exciting, but sometimes you

just want something else in life and that's totally

normal. It's important to be able to let go of things and

move on to, as you said, you're following your dreams.

It's like no better place to be. So very excited for you.

Very pumped up. Can't wait to hear some of the great

things you'll do. Maybe like in a year, a year and a half

we can do another podcast where you'll tell us about

all those things you've created in Haiti and how many

people you've gotten up to speed on data science. I

think that'll be very cool.

Morgan Mendis: Yeah. Well, I have to push you on that: I would argue for

me success would be that you are inviting somebody

from Ayiti Analytics who we've trained and who we've

got off the ground to be on the show to talk about what

they've done in maybe the last nine months.

Kirill Eremenko: Nice.

Morgan Mendis: That would be to me success. That would make me

happy.

Kirill Eremenko: That was awesome. All right. Okay. Let's blend that in.

Sounds like a good idea. Okay. So let's take this

opportunity that we're on the podcast and what I

would love to do is you are by far one of the most

advanced data scientists I've encountered in my,

speaking with data scientists, meeting people,

traveling around the world. And I want to... That's one

of the other reasons why I invited you on the show. I

really want to share your experience of advanced data

science with our audience and be like what kind of

takeaways they can get. So the things that you're

teaching in Ayiti and the people that you get up to

speed, I think they're going to be very lucky to have

such a great mentor as you leading them. So I would

love to see what kind of insights you can also share

with our audience here. How does that sound? Do you

mind going through a couple of your case studies or

use cases of data science that you've done in the past?

Morgan Mendis: Sure, I'll have to preface it with the fact that I don't

consider myself even within the top 10% of data

scientists. So I appreciate the compliment, but I think

that what I do is that I take models and I take

whatever tool best fits the situation and hey,

sometimes that's Excel. I make that argument all the

time: you don't need advanced tools. Sometimes

you just have to use the tools to their full capability or

full extent. But yeah, I'm happy to go through some of

the case studies that we've discussed earlier.

Kirill Eremenko: Mm-hmm (affirmative). Sounds good. All right, well

let's get started. So are you going to be fine if we go

through your experiences like post-graduation, like

one by one like ChenMed, Aledade and so on?

Morgan Mendis: Sure.

Kirill Eremenko: Is that better?

Morgan Mendis: Yeah, yeah. No, that's fine.

Kirill Eremenko: Awesome. So in that case, let's get started, and

perhaps let's just go through your experiences after

graduation one by one. You mentioned in your

email that the first role that you had post-college

graduation was at ChenMed. So like what did you do

there and what kind of tools did you use?

Morgan Mendis: Sure. So I first, when I got out of college, I started as a

business analyst at ChenMed. So they are a medical

provider. They run several different facilities across

South Florida and they've opened up across the

Southeast. But my position really was using Excel to

do a lot of business analytics. So that was writing

reports, copying charts, and putting them into

PowerPoint and all of this. I was a little frustrated. I

was like, we can be automating a lot of this.

And so what I started doing was that I started

automating a lot of the work that I was doing in R, and

I would actually show up really early in the morning

before everybody else, at seven o'clock, so I could start

automating the reports and my manager had no idea

that I was secretly automating the reports. But the

reason why I would stay late is because I was then

using the data to then explore different tools and

different libraries in R.

Morgan Mendis: So one of the things was, we had a challenge of

understanding how to move different... One of the

patient offices was closing and we wanted to reallocate

people to different offices within the geographic region.

Very simple. Took the data, converted it into longitude and

latitude, placed it with ggmap onto a map. And then

I was able to calculate the distances to the local offices

and say, "Hey, we can just put all of the patients to

this office and they won't reach capacity. We can put

the other patients in this office and they won't reach

capacity." And based on proximity, they'll still be able

to do it within their regular commute. And it won't be

that much of a transition for the patient populations.

So very, very simple things like that. Taking existing

tools, right? But my key thing was, "Hey, I already

know these tasks exist out there. Let me just try and

automate them and that way I can get more access to

the data to explore what else we can do."
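
The proximity check described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R and ggmap workflow Morgan used at the time; the office names, coordinates, and capacities are made up, and the greedy nearest-office rule is an assumption about how such a reallocation might be done, not a description of ChenMed's actual process.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def assign_patients(patients, offices):
    """Greedily send each patient to the nearest office that still has room.

    patients: {patient_id: (lat, lon)}
    offices:  {office_name: {"loc": (lat, lon), "free_slots": int}}
    Returns {patient_id: office_name, or None if every office is full}.
    """
    remaining = {name: o["free_slots"] for name, o in offices.items()}
    assignment = {}
    for pid, (plat, plon) in patients.items():
        # Rank offices from nearest to farthest for this patient.
        by_distance = sorted(
            offices,
            key=lambda name: haversine_km(plat, plon, *offices[name]["loc"]),
        )
        chosen = next((n for n in by_distance if remaining[n] > 0), None)
        if chosen is not None:
            remaining[chosen] -= 1
        assignment[pid] = chosen
    return assignment

# Hypothetical coordinates in the Miami area, for illustration only.
offices = {
    "north": {"loc": (25.90, -80.20), "free_slots": 1},
    "south": {"loc": (25.70, -80.30), "free_slots": 2},
}
patients = {
    "p1": (25.89, -80.21),
    "p2": (25.88, -80.19),
    "p3": (25.71, -80.31),
}
result = assign_patients(patients, offices)
```

Here the second patient is closest to the north office, but that office is already full, so the sketch falls back to the next nearest one. A greedy rule like this is easy to explain to office staff, though it does not guarantee the smallest total travel distance.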

Kirill Eremenko: Mm-hmm (affirmative). What's an example of the

automation that you were performing in Excel with R?

Morgan Mendis: So there was a lot of opening up different

spreadsheets, setting up linkages, lots of different,

what we would consider typically to be unions or

concatenations of the datasets. And so that was a

simple thing is that I would have to wait potentially for

a new data set to come in or a new Excel file to come in

before I could potentially do my full report or run my

aggregations, like the pivot tables. And to me it was

like, "Pivot tables are cool, but there's nothing special

about them. We're familiar with them as GROUP BY

in SQL." So I started just writing up scripts. I

was like, okay, once the data comes in, boom, run the

script and the data's going to be pumped down to a

new Excel spreadsheet. I just have to then move a

table into a PowerPoint and focus more of my time on

analyzing the data for the key insights that we'd be

providing back to the office professionals rather than

spending my time trying to make sure that my indexes

and my VLOOKUPs match up.
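
As a rough sketch of that kind of script: stack the incoming files (the "union" step), aggregate them the way a pivot table or a SQL GROUP BY would, and write a fresh summary file. Morgan did this in R; the version below uses only Python's standard library, and the column names and numbers are invented for illustration.

```python
import csv
import io
from collections import defaultdict

def read_rows(text):
    """Parse one CSV export into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def union(*tables):
    """Stack tables that share the same columns (the concatenation step)."""
    return [row for table in tables for row in table]

def total_by(rows, key, value):
    """Pivot-table-style aggregation: SUM(value) GROUP BY key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row[value])
    return dict(totals)

# Two made-up monthly exports with identical columns.
january = "office,visits\nnorth,10\nsouth,7\n"
february = "office,visits\nnorth,12\nsouth,5\n"

combined = union(read_rows(january), read_rows(february))
visits_by_office = total_by(combined, key="office", value="visits")

# Write the summary back out as CSV, which Excel can open directly.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["office", "total_visits"])
for office, total in sorted(visits_by_office.items()):
    writer.writerow([office, total])
report_csv = out.getvalue()
```

In practice the strings would be replaced by files on disk, and the script would be rerun whenever a new export arrives.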

Kirill Eremenko: Mm-hmm (affirmative). Mm-hmm (affirmative). Okay,

got you.

Morgan Mendis: So it's about taking the time so that you can focus on

analysis and understanding the data rather than

focusing on, "Hey, am I looking at the right file? Hey, is

there any kind of data validation errors going on?"

That's another example is that, if you need to validate

the data or if you necessarily had issues with

potentially the formatting in Excel, with R, with

Python, you can automate most of that too, do unit

testing in order to validate the data, so that you

don't necessarily need to spend your valuable time as

an analyst checking these small little boxes. You can

spend more of your time understanding the data,

understanding the process by which the data

got to you, and potentially how you can make it

more valuable when you send it off to the next person.
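
The automated validation mentioned above might look something like this. It is a minimal sketch under assumed rules: the column names (`patient_id`, `visit_date`, `cost`), the date format, and the checks themselves are hypothetical stand-ins for whatever a real report would require.

```python
from datetime import datetime

REQUIRED = ("patient_id", "visit_date", "cost")  # hypothetical column names

def validate_rows(rows):
    """Return a list of readable problems; an empty list means the data passed."""
    problems = []
    for i, row in enumerate(rows, start=1):
        # Required columns must be present and non-empty.
        for col in REQUIRED:
            if not str(row.get(col, "")).strip():
                problems.append(f"row {i}: missing {col}")
        # Dates must follow one agreed format (spreadsheets often mangle these).
        date = row.get("visit_date", "")
        if date:
            try:
                datetime.strptime(date, "%Y-%m-%d")
            except ValueError:
                problems.append(f"row {i}: bad visit_date {date!r}")
        # Costs must parse as non-negative numbers.
        cost = row.get("cost", "")
        if cost:
            try:
                if float(cost) < 0:
                    problems.append(f"row {i}: negative cost {cost}")
            except ValueError:
                problems.append(f"row {i}: non-numeric cost {cost!r}")
    return problems

good = [{"patient_id": "a1", "visit_date": "2019-03-01", "cost": "120.50"}]
bad = [
    {"patient_id": "", "visit_date": "03/01/2019", "cost": "abc"},
    {"patient_id": "a2", "visit_date": "2019-03-02", "cost": "-5"},
]
good_problems = validate_rows(good)
bad_problems = validate_rows(bad)
```

Checks like these can run every time a new file arrives, so the analyst only looks at the rows that were flagged instead of scanning the whole sheet by eye.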

Kirill Eremenko: Mm-hmm (affirmative). I love it. So you're freeing up

your time from checking small things or concatenating

data, doing these recurring tasks in order to have

more time for exploration. That's a really, really cool

thing. And it sounds a lot like robotic process

automation, this type of automation. The

scripts that you wrote, did you need to like run them

yourself or were they running in the background like

every night or something like that?

Morgan Mendis: So again, this is part of me actually, I had to write the

scripts and there was no automation. I couldn't set up

a cron job at the time. So they, at this organization at

ChenMed, they were pretty tight about what tools you

could use and what access you could get to the

internal systems. So there was no way that I could

just, okay, I'm going to go hack into their system and

set up this cron job or set up this automation process.

I wrote the scripts, so that's why I would come in early

and I didn't mind coming in early because then I said,

"Hey, I get to spend the rest of the day exploring and

innovating with the data. So I don't mind coming in

early to just run the script. I can do it while I'm having

my cafecito." This was while I was in Miami.

Kirill Eremenko: Nice.

Morgan Mendis: So I love the Cuban coffee.

Kirill Eremenko: Got you. Okay, cool. Very cool. I think it's a very

valuable insight or career advice for people. Like if you

find yourself doing recurring things where stuff can be

automated. And I love your dedication. Like already

you can tell you're loving what you do. You come in

early, you stay late, you're having fun along the way.

Fantastic. What was next, where'd you go after

ChenMed?

Morgan Mendis: So after that, because I wanted to get more access to

actually deploying more advanced data science and

actually using more tools, I took a job at Aledade as a

data analyst. And so the first day when they hired me,

they were like, yeah, you can use R. That's great. And

once I started, the engineers looked at me and said,

"No, this is a Python shop. You got to learn Python."

Kirill Eremenko: So they didn't know what they used at the interview?

Morgan Mendis: No. So originally the rest of the analytics team at the

time was using SAS and I had studied econometrics with

Stata in my undergrad, so I was kind of familiar with

the idea of SAS, but I was like, "Hey, it's not open

source. It's not my go-to tool." But when they

suggested I learn Python, I was like, I've been looking

for an excuse to really ramp myself up on that. And so

luckily I had a really good mentor at the company, a

gentleman by the name of Jim Fulton, who really

provided a lot of guidance to me in learning Python

and learning some very good standards for software

development. And I don't consider myself a software

developer by any means, but he definitely helped to

guide me along and help me learn about a lot of the

tools, even though he wasn't working in data science

about how you could use Python for data science. He'd

been using Python for so long, he's like, "No, of course,

this makes sense. It's a great tool for data analytics."

Morgan Mendis: So, at that point I started exploring Python and

PostgreSQL and really got excited about it because I

was like, "Oh man, this opens up a whole new set of

tools and opportunities for me to either automate or

explore new modeling techniques and connecting

different tools." So again, for me it's all about finding

the right set of tools to improve the process and then,

as you improve the process, you're going to get a lot of

gains across the organization. So I was really enjoying

it at that point, but a lot of my most enjoyable

experiences there were actually more about leveraging my

training in economics to do econometrics.

Morgan Mendis: So at Aledade they are working for what's called

accountable care organizations. And they were in a

similar vein to ChenMed. They were trying to push

patient centricity in a new movement that was called

value-based care. And so we started exploring how

we can analyze the data to make the system more

effective at providing optimal care to patients while

also reducing the cost of care. Right. So that's a huge

issue: healthcare is very expensive. So is there

potential, not necessarily to reduce the medical

services but instead to say which kinds of medical

services are going to have the greatest gains for

patients?

Kirill Eremenko: Okay, very interesting. Okay. And so you were able to

actually leverage econometrics and combine that with

data science. What are your takeaways from that? Not

often do those two get combined by data scientists.

Morgan Mendis: Yeah. So I think the really interesting aspect was that,

and this is where I really started understanding

the enjoyment of doing data science, was that we were

using statistics to do these evaluations and to better

understand what we call medical interventions or

different treatments for different patient populations.

And the really interesting thing is that, I was able to do

like some complex analysis and make some strategic

decisions which changed our program delivery in

terms of how we suggested our health system work.

And I remember years later after doing this analysis, I

was reading Health Affairs, which is a really

popular journal for health data and analysis. And a

group of researchers had published some findings that

I had already uncovered earlier at this company. The

difference though was that after I did my analysis, I

had found some discrepancies in what our original

hypothesis was.

Morgan Mendis: And so I went down and did deep dives into the data.

Then I went up and I called actual people who are

working in the sites to come up with a patient

narrative, to better understand the data, and

then we were able to use that narrative to explain the

model and explain the results to other people. Because

a lot of people don't want to see your regression

outputs, right?

Kirill Eremenko: Mm-hmm (affirmative).

Morgan Mendis: That's not the thing that's going to change their mind.

What's going to change their mind is if you give them a

story, give them a story that they can remember so

they can better understand, in the future be like, "Oh,

this kind of sounds like what Morgan was telling me

about the story." And so I thought it was really

interesting that I remember reading through the article

and at the end they identified this area but they didn't

provide any narrative or any explanation. And that was

the benefit to me was that, having this opportunity to

actually within a data science capacity, to actually be

able to use these advanced methods and get to the same

conclusions as academics, but being able to work with

actual practitioners to come up with a narrative that's

going to change the system. That's the exciting thing.

Kirill Eremenko: Is that why you used Tableau in this role to explain

these things in a more visual way?

Morgan Mendis: Yes, yes and no. So Tableau was one of the tools that

we use to obviously help tell the narratives. We also

use Tableau as a business intelligence tool and it just

allowed us to rapidly take all of the massive amounts

of data we had and quickly turn it out. So when I

think about Tableau and I think about its value, I

think about how quickly you can transform data into

insights. But it's not going to

replace the opportunity to spend time with people,

talking to people. And I think that that's the one thing

I want to emphasize here is that, Tableau is great for

being able to present information, but it's not going to

replace the storyteller. It's just an aid or tool to help

you as a storyteller.

Kirill Eremenko: Mm-hmm (affirmative). Okay. Absolutely.

Morgan Mendis: You got to make sure you have a story to tell, and

there's a quote that I'm reminded of, which is that "stories

happen to those who know how to tell them".

Kirill Eremenko: Wow. Nice. Very beautiful. Very beautiful quote. Okay,

so you manage to combine these two fields,

econometrics and data science. What else did you

experience in this role? Because there are some other

tools you used, not just Python but also R. You used

both Python and R in this role. Is that right?

Morgan Mendis: So I really preferred actually the statistical output from

R, so there are some regression models at the time

that I couldn't find in Python and I just felt a little bit

more comfortable with the robustness of all the output

that R was giving me. And as your listeners may well

know, R was designed really by statisticians. It

was born out of S. So there's a lot of model

development historically that's been done in R, and

there's a lot of really interesting or

innovative modeling that's been done in R previously

before it got into Python or before data science blew up

to what it is today. So at the time, I really liked using

Python because it helped me connect my different

solutions, but I liked to use R when I was actually doing

the evaluations. But if I was building a new data

product for example, or connecting some SQL to build

a web application, then I was going to go to Python

and I might then just have Python execute an R script

if needed.
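
Having Python execute an R script, as described above, usually comes down to launching R's command-line runner as a subprocess. The sketch below assumes `Rscript` is installed and on the PATH; the script name `fit_model.R` and its arguments are hypothetical.

```python
import shutil
import subprocess

def rscript_command(script_path, *args):
    """Build the command line for running an R script non-interactively."""
    return ["Rscript", "--vanilla", script_path, *map(str, args)]

def run_r_script(script_path, *args):
    """Run the R script and return whatever it printed to stdout.

    Raises RuntimeError if Rscript is not on PATH, and
    subprocess.CalledProcessError if the script exits with an error.
    """
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    completed = subprocess.run(
        rscript_command(script_path, *args),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout

# 'fit_model.R' and its arguments are placeholders, not a real script.
cmd = rscript_command("fit_model.R", "claims.csv", 2019)
```

With this split, Python stays in charge of connecting databases and applications, and the R script only needs to read its inputs and print or save its results.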

Kirill Eremenko: Okay. Yeah. Makes sense. I've heard of that done

before. But since then, have you moved completely to

Python or are you still using a combination of the two?

Morgan Mendis: So I definitely go to Python. Like, at my new

position, I haven't installed R or RStudio yet, but for

example, I was working on the side doing a research

project with a former colleague and he asked me, he

was like, "Would you mind doing this in R so that I can

follow along with your code?" And I said, "Of course."

That's a 100% valid reason to use a language: so that

you can collaborate with others. If your team isn't

using Python, don't force Python on them. Use the tool

that's going to best enable you guys to work together.

Because to me, collaboration is way more important

than your personal speed in a language.

Kirill Eremenko: Mm-hmm (affirmative). Yeah, no, totally. Totally agree.

So as I understand you're mostly using Python now,

but do you still use both from time to time?

Morgan Mendis: Yeah.

Kirill Eremenko: Oh, okay. Got you. Anything else? Did you use any

advanced, I don't know, maybe like deep learning in

that role?

Morgan Mendis: Oh no, no. I have not actually had the opportunity to

explore deep learning in production. I've only been able

to play around with it in my like personal side

projects, but nothing in production yet.

Kirill Eremenko: Okay, got you. And so when you would deploy models

into production, would you then later maintain them,

and make sure, like check up on them, that

they're working well?

Morgan Mendis: Yeah, I think that's a key thing, of course,

depending on the interval of your data. So for

example, we might be reevaluating models on a

monthly basis because for example at Aledade they

were getting data from the government on a monthly

basis. And because of the lag in terms of getting

complete records and claims data, you might only be

getting a portion of a given month. So you're going to

have to wait almost three or four months before you

have a full picture of all of the events that transpired

in a given month. So you have to keep updating your

models regularly to incorporate the new data or

potentially corrections to the data that might be

coming. So, especially if you're dependent on data

coming from a third party, right? They might change

their methodology for how the data

is being sent over to you. Right?

Kirill Eremenko: Mm-hmm (affirmative).

Morgan Mendis: Or some other kinds of procedures. So you need to be

able to quickly, you need to be regularly, I'm sorry, not

quickly. You need to be able to regularly go back and

reevaluate.
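
One way to picture the reporting lag is to split service months into those that have had time to fill in and those that are still provisional, and to refresh anything built on the provisional months at every data delivery. The three-month cutoff below is an assumption taken from the "three or four months" mentioned in the conversation, not a rule from any data provider.

```python
from datetime import date

def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (first-of-month dates)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def split_by_maturity(service_months, as_of, lag_months=3):
    """Separate months whose claims have had time to arrive from those still filling in."""
    mature, provisional = [], []
    for month in service_months:
        if months_between(month, as_of) >= lag_months:
            mature.append(month)
        else:
            provisional.append(month)
    return mature, provisional

months = [date(2019, m, 1) for m in range(1, 7)]  # January to June 2019
mature, provisional = split_by_maturity(months, as_of=date(2019, 7, 1))
```

Looking from July, January through April count as mature here while May and June are still provisional, so results for those two months would be expected to move as later deliveries arrive.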

Kirill Eremenko: Mm-hmm (affirmative). Okay. Got you. Awesome. Well

thanks for sharing this role at Aledade. It sounds like

a very important step in your career where you got

to learn Python and also apply econometrics in

combination with data science. Yeah. What was the

next step after that?

Morgan Mendis: So I joined Inspire, the position we were just talking

about, where they're hiring for the VP of data science,

and I was actually the first data science hire at the

company.

Kirill Eremenko: Oh, okay.

Morgan Mendis: So, yeah. So it was really kind of a greenfield

opportunity and I actually got to learn a lot about all

the different aspects of data science in terms of

building up first key analytics and then moving to

challenges such as buy versus build. So, I think I ran

the gamut at Inspire in the different things I did. So I had

some fun, building out like time series regressions to

forecast member growth, obviously that's fun

modeling. The other thing was, working with AWS and

setting up different systems to build a data lake. So

just to start there: because it was a greenfield

opportunity and I was the first data science hire, I had

a different perception of the way systems need to work

in order to produce high quality analytics. And so the

company had existing reports that they needed. But

for me I wanted to get into the exciting work of "Hey,

let's start building exciting data science products."

However, I need to have the data in a format and in an

environment that's accessible and is going to allow for

development.
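
The time series regression for member growth mentioned above is not described in detail, so the following is only the simplest possible stand-in: an ordinary least squares trend line fitted to monthly counts and extended forward. The member numbers are invented, and a real forecast would likely also need seasonality and uncertainty estimates.

```python
def fit_linear_trend(values):
    """Least squares fit of value = intercept + slope * t, for t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope

def forecast(values, steps):
    """Extend the fitted trend `steps` periods past the end of the series."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(steps)]

members = [1000, 1100, 1200, 1300, 1400]  # made-up monthly member counts
intercept, slope = fit_linear_trend(members)
next_two = forecast(members, steps=2)
```

Because the toy series grows by exactly 100 members a month, the fitted slope is 100 and the next two forecasts continue the same line.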

Morgan Mendis: So the first thing I had to do was start designing a

data warehouse and building out an ETL process. So I

want to say that probably half of my job was actually

data engineering and then maybe a quarter was

actually doing exciting data modeling in data science.

And then another quarter was actually doing much

more of like the analytics management of saying these

are the tools that we need to put in place, these

are the kinds of resources we need, and this is

how we need to prioritize all of these different

objectives and opportunities.

Kirill Eremenko: Mm-hmm (affirmative). Okay. Interesting that you

separate them like that. So data engineering would

involve putting the right datasets together and making

sure that all data is flowing properly. Is that about

right or are there other elements to that role?

Morgan Mendis: No, so I would say that it really encapsulates much

more than just building the right datasets. It's about

identifying where the data's coming from and then

mapping out the best processes for putting them into

an environment that's accessible for data scientists or

data analysts. So I say it's more than just

creating data sets because, I think, a good

data engineer is like critical to any team. Again, I

wouldn't consider myself a good data engineer. It was

more of, you have to get this work done in order to do

the fun stuff. So it's actually much more related

to software engineering, I think in the sense that you

need to be able to put these systems together in a

sustainable way so that the data analysts, the data

scientists don't need to worry so much about cleaning

and validating the data. They can spend more of their

time analyzing it and talking with stakeholders to

build products.

Morgan Mendis: So in that role, I first took the data, I explored

AWS Data Pipeline, that didn't work. I thought about

just writing a bunch of my own scripts and setting it

up with cron jobs. And I was like, that's not

sustainable. So eventually I settled on Airflow. And

this is an Apache project that, I think, was originally

developed by Airbnb and it's pretty amazing. It just

allows you to set up a lot of jobs that happen in

parallel so you can move data from one system to

another, process it multiple times, and you're actually

able to create directed acyclic graphs. So you can see

in a network how your data's flowing and potentially

where there's bottlenecks or potentially where there's

errors that are going to break down your ETL process.

Kirill Eremenko: Okay. Wow. And so you settled down on Airflow and

that solution is still running now?

Morgan Mendis: Yeah.

Kirill Eremenko: Wow.

Morgan Mendis: And yeah, I'm hoping to set up a lot of other jobs in

Airflow in the future. I think it's a great tool. I hope

that AWS catches up with their Data Pipeline, but I

always keep my fingers crossed. I think AWS, they're

always going to produce something amazing. But right

now I think Airflow is doing a great job. And especially

if you're looking for something to quickly get started

with to build your own processing. Highly suggested.

Kirill Eremenko: Okay. So tell me a bit more, how does it work? So you

have lots of data sources, you have an ETL process.

How does Airflow facilitate that?

Morgan Mendis: Okay. Yeah. So Airflow uses what's called operators.

So you might be writing scripts and writing them as

functions, as callables, in order to move data,

transform data, or produce reports, right? So you

can use Airflow to send messages, you can use Airflow

to, like, send emails, like automated reports to end

users. But the key thing is that you're writing

functional scripts in Python or you could actually use

other languages too. I know it basically allows you to

use whatever language is best suited for you. So you

could write bash scripts if you wanted to. And they use

what's called these operators to then call these

functions to do these transformations.

Morgan Mendis: So Airflow also has, I believe integrations with Redshift

and Postgres and many other like popular data tools,

so you can then say, "Hey, I've got this data in my

MySQL database, I want to federate it across, let's say put

it into an Elasticsearch index, right? And I'm going to set

this index up." So I would say, all right, Airflow, copy

this data down, put it into CSVs from MySQL, then store

those flat files into S3, then from S3 I'm going to load

them into Elasticsearch and/or separately into

Redshift.

Kirill Eremenko: Mm-hmm (affirmative).

Morgan Mendis: Right? And each one of those tasks, moving the data

from MySQL to CSV, from the CSV to S3, from S3 to

whatever data system or other storage system you

want for analytical querying, each one of those tasks

could be a Python function and you would then have

different nodes set up. And so then you can create

dependencies, right? So that one task will only execute

after successful completion of a previous task, right?

So as you imagine, there can be splits, right? Or there

could be different tasks happening in parallel and all

of that can then be represented via a directed acyclic

graph. So if you're a big fan of network theory and

optimization as I am, you are really excited because

then you have a bunch of graphs and nodes and you

can watch them all move and execute and you just get

a little giddy about watching it all.
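
The dependency idea described above (each task is a Python callable, and a task runs only after the tasks it depends on have succeeded) can be shown without Airflow at all. The sketch below deliberately does not use Airflow's API; it uses the standard library's `graphlib` to order a toy version of the MySQL to CSV to S3 to Elasticsearch and Redshift flow, with placeholder task bodies. In Airflow itself the same structure would be declared with operators inside a DAG, and the scheduler would handle retries, parallelism, and the web view.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_pipeline(tasks, upstream):
    """Run each task only after all of its upstream tasks have succeeded.

    tasks:    {task_name: zero-argument callable}
    upstream: {task_name: set of task names it depends on}
    Returns the order in which the tasks ran.
    """
    order = []
    for name in TopologicalSorter(upstream).static_order():
        tasks[name]()  # an exception here stops the run, so nothing downstream executes
        order.append(name)
    return order

log = []

# Placeholder task bodies; real ones would talk to MySQL, S3, and so on.
tasks = {
    "mysql_to_csv": lambda: log.append("dumped tables to CSV"),
    "csv_to_s3": lambda: log.append("uploaded CSV files to S3"),
    "s3_to_elasticsearch": lambda: log.append("indexed into Elasticsearch"),
    "s3_to_redshift": lambda: log.append("loaded into Redshift"),
}

# The graph splits after the S3 upload: both loads depend only on it.
upstream = {
    "mysql_to_csv": set(),
    "csv_to_s3": {"mysql_to_csv"},
    "s3_to_elasticsearch": {"csv_to_s3"},
    "s3_to_redshift": {"csv_to_s3"},
}

order = run_pipeline(tasks, upstream)
```

This toy runner executes tasks one at a time; the two load steps have no dependency on each other, which is exactly the kind of branch a real scheduler can run in parallel.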

Kirill Eremenko: That's so cool. It sounds like a very advanced version

of SSIS that Microsoft provides.

Morgan Mendis: Yeah, I can't say I've worked with SSIS but it, yeah, it's

open source. So I think you can probably-

Kirill Eremenko: It's just like a really cool advanced ETL tool.

Morgan Mendis: Yeah.

Kirill Eremenko: Mm-hmm (affirmative).

Morgan Mendis: Yeah, exactly. Like I said before, Amazon

has their Data Pipeline process and it's very similar in

that you can write scripts and you can even leverage it

with AWS Lambda functions in order to execute

different things. In the same way, with Lambda

you're hitting different functions and you have inputs

and outputs. It's similar, I just thought that it was a lot

easier to work with Airflow. It's all in one system.

Rather than going to AWS where you have to work in

their ecosystem, you have to configure everything via

JSON. The nice thing is that Airflow gives you, it'll spin

up like a Flask instance so you have a web app that

you can interact with so you can turn on and off

different operation jobs. And I was actually, I'm really

impressed. One of my friends showed me that his

company wrote their own custom operators in Airflow

in which they were then able to use what they call

Jupyter-centric development.

Morgan Mendis: And this is a really interesting idea: to make

data scientists able to quickly iterate and prototype

and put things into production, they would just make

it so that if you write a Jupyter notebook that

executes and builds a model or processes the data,

whichever way the data analysts and the data

scientists want, all they need to do is to have a

notebook that's clean enough that can run from start

to finish successfully and boom, you just take that

notebook, you send it to your data engineer, they put

it into Airflow and you have readable code. Right?

Kirill Eremenko: Wow.

Morgan Mendis: Right. It's a beautiful thing is that you have readable

code living and working in production. You don't need

to take your code from the Jupyter notebook, copy it to a

.py script, wrap it in another function. No, you just

have a Jupyter notebook. It's all there.
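A rough sketch of what "a notebook that runs from start to finish" means in practice is below. The transcript does not describe how the custom operators were built, so this is an assumption-laden illustration rather than that company's method or Airflow's API: the `run_notebook` helper and the tiny in-memory notebook are hypothetical, and the sketch relies only on the fact that a notebook file is JSON with a list of cells. Notebooks that use IPython magics would need a dedicated notebook-execution tool instead of plain `exec`.

```python
import json

def run_notebook(notebook_json):
    """Execute a notebook's code cells top to bottom in one shared namespace.

    Hypothetical helper: returns a dict saying whether every code cell ran,
    and which cell index failed if one did not.
    """
    notebook = json.loads(notebook_json)
    namespace = {}
    for index, cell in enumerate(notebook["cells"]):
        if cell["cell_type"] != "code":
            continue  # markdown cells are documentation, nothing to run
        source = cell["source"]
        if isinstance(source, list):  # source may be stored as a list of lines
            source = "".join(source)
        try:
            exec(source, namespace)
        except Exception as error:
            return {"ok": False, "failed_cell": index,
                    "error": error, "namespace": namespace}
    return {"ok": True, "failed_cell": None,
            "error": None, "namespace": namespace}

def make_notebook(*code_cells):
    """Build a minimal notebook (as a JSON string) for the demonstration."""
    cells = [{"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]}]
    for code in code_cells:
        cells.append({"cell_type": "code", "metadata": {}, "outputs": [],
                      "execution_count": None, "source": code})
    return json.dumps({"nbformat": 4, "nbformat_minor": 5,
                       "metadata": {}, "cells": cells})

# A clean notebook: later cells rely on names defined by earlier cells.
clean = make_notebook(["values = [1, 2, 3]\n", "total = sum(values)"],
                      "mean = total / len(values)")
clean_result = run_notebook(clean)

# A notebook that is not clean: the second code cell uses an undefined name.
broken = make_notebook("x = 1", "y = x + undefined_name")
broken_result = run_notebook(broken)

print(clean_result["ok"], broken_result["ok"], broken_result["failed_cell"])
```

A scheduler task could call a helper like this and mark itself failed whenever `ok` is false, which is what keeps a half-working notebook out of production.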

Kirill Eremenko: Very cool. Very cool. That's the way it should be, right?

Like why should you have to go through all these

hoops and finally create potential for additional errors

when you can just, you already have the code, just

run it.

Morgan Mendis: Yeah. And again, like I said, because it's readable and

you have all the formatting and the benefits, like for

example, you might have it so that your Jupyter

notebook is going to produce some kind of

visualization of your metrics or evaluation of your

models. I'm not sure if you guys have talked previously

about using data visualization to evaluate models in

development, but if you have that in your script, then

you could quickly go into Airflow or go into your

system, look up the notebook and say, "Oh hey, that

model that we've been checking up on, we can see

how well it's performing with the new data via Airflow

by just checking the notebook," rather than having to go

through all these other hoops to evaluate the

model.

Kirill Eremenko: Mm-hmm (affirmative). And by validating the model,

you just mean like the lift curve and things like that.

Morgan Mendis: Yeah, you could have all of those visualizations right

there.

Kirill Eremenko: Yeah, so they're just updated with your data.

You don't have to rerun them specifically.

Morgan Mendis: Yeah.

Kirill Eremenko: Very cool. Very useful tool. Thanks. Thanks a lot for

the debrief on how Airflow works. It makes me

wonder: Excel, R, Gigi Maps, SQL, Python,

Tableau, Scikit-learn, AWS, Airflow, Plotly, Plus. All

those you've already mentioned except for Plotly

probably. You've already mentioned on this podcast

once or twice or many more. Is there a limit to the

number of tools an advanced data scientist should

know? Or do you just keep picking up new ones

all the time?

Morgan Mendis: No, I think, yeah, you've got to constantly be learning.

Like, I didn't even touch on using Spark or anything

like that or trying, like right now I'm very interested in

learning Scala just because I'm like, "Oh, this is going

to be a great language." Then maybe move on to Kotlin

or something else. But I think that as you learn new

tools, you think of problems in a different way and I

think that's the key thing is that, for example, I love,

as I mentioned earlier, I love learning languages

because I like talking to people. But I also know that

what's really interesting in learning languages is that

you start thinking in a different way. You start

thinking culturally or you start thinking as you

approach people in a different way than you might

have thought before.

Morgan Mendis: There's a popular theory, I don't know actually how

popular it is, but it's called Whorf Theory, from

linguistics, which says that you are limited in what

you can think based on the language that you know.

Right? So for example, we might not be able to think of

a certain solution because we don't have the words to

envision it or describe it. So again, I think I was

actually listening to one of your previous podcasts about

the future of AI and I thought this was good, as you

guys were mentioning, can you explain an ant

language to a monkey, I mean, can you explain in ant

language what a monkey is to an ant? Right? And I

think that that's a key thing is that as you learn new

tools and languages, you're going to start thinking

about problems in a different way. Right?

Morgan Mendis: And I don't think it's about, like I said, I still go back

to Excel to do certain tasks, not because I'm like, "Oh,

Excel is the most advanced tool." But instead I tell

people, I was like, "Instead of learning necessarily all of

these advanced tools, it's like have you mastered

Excel? Because Excel might be the key tool you need

to be successful or for your organization to find value

in the work you're doing."

Kirill Eremenko: Absolutely. Absolutely. And what languages do you

speak out of curiosity?

Morgan Mendis: So English, Spanish and Portuguese. And I'm working

now on my French and Haitian Creole.

Kirill Eremenko: Nice. Very nice. And so what I wanted to say, to your

point, this is a great example that sometimes you

cannot think of a solution, just because we are limited

by the language we speak. So I was born in Russia and

I speak Russian. And like recently I've been interested

in Eastern European languages. Now I'm learning

Spanish as well and I've found this very peculiar

phenomenon that for example, in Russian, we don't

have a separate word for hands and arms, we just

have one word, ruka. And that means both hand and arm.

So what we would be saying, instead of "shake your

hands" in English, would translate to "shake your

arms." We don't have a separate word for feet versus

legs. We just have one word, noga. Like "put your

shoes on your feet" in Russian would sound in English like

"put your shoes on your legs." And things like that.

Kirill Eremenko: And that is the same in the Czech language and the Polish

language and, as far as I know, in some other Eastern

European languages: there's just no separate word to

distinguish between arm versus hand, foot versus leg.

And I think maybe like, I'm not sure you might

comment on this like, is that the same in Spanish or

not? But it just shows that there are certain

limitations, as you say. And I can also see how

you're making the point that in data science as well,

the more tools you know, Python, R, Excel, Tableau,

whatever it is, AWS, Scala, Spark, the more

opportunities you have to think of different solutions.

Morgan Mendis: Yeah. I think that the key thing is that it's also culturally

relevant. Right? I don't know if in Russia it's popular

to, obviously I think you do shake hands, but

how popular it is culturally to think about these

things. I was talking to one of my colleagues yesterday

about the idea that in some languages they don't have,

like they don't have left and right. They only think in

cardinal directions.

Kirill Eremenko: Interesting.

Morgan Mendis: Right? So yeah, when they talk about something they

use like the idea of East to West, right? To describe

progression or as time moves. Right? Or in a story. And

it's like really interesting to think about that. And I

think it also goes into also like what's most relevant to

a given culture, to a given group of people. Right? Like

we know that in some cultures, some Native American

cultures in Alaska, they have multiple, they have over

10 words for snow, right? To describe all the different

types of snow. Right? And for us English speakers,

we're just like "Hey, there's snow." You get snow.

Right?

Kirill Eremenko: Yeah.

Morgan Mendis: I actually am wondering, I'm like, I wonder what the

Creole word is for snow right now.

Kirill Eremenko: Yeah.

Morgan Mendis: Because I don't know how often they get to see it.

Kirill Eremenko: Yeah. That's right. Wow. Wow. Very cool. And in

addition to like the words or use cases in the case of

data science, so for different languages, the

programming languages in this case, in addition to

that variety, which enriches your problem-solving

abilities, it also enhances the neural pathways in your

brain. If you keep speaking in Spanish and then in

English you're going to use different neural pathways

or slightly different neural pathways. The greater the

variance between the languages, the greater you're

going to have to engage your brain. Sometimes like

learning a language, sometimes I'm sitting there, my

head actually hurts because I feel that something is

changing. I have to overcome these long neural

pathways and new ones have to be formed for me to

think faster in the new language. Same thing with

programming languages. Same thing with all these

tools that we use. The more of them we use, the more

neural pathways, I believe, your brain is going to develop

in order for you to use them faster, and that's going to

help you, aid you in coming up with solutions faster as

well.

Morgan Mendis: Yeah, no. Honestly, that's I think one of the most

exciting things if you're a forever learner: you

know when you're struggling that, "Oh wait, things are

changing in your head." Right? You've got to maybe

throw out this previous limitation because there

was no connection point through that neural pathway,

right? There was no connection point, but now

because you've added in this language or you've added

in this concept, it allows you to rethink things and it's

like, "Oh wait, now I can send information down this

way. I can send my elec. Right? I can send those

neural electrodes down that way so I can think of it

this way."

Kirill Eremenko: Yeah.

Morgan Mendis: Oh, I think that that's really, really exciting. And I

mean that's why I think learning languages is a crucial

thing to help people conceive of new problems. And as

you're talking about like in Spanish, one thing that I'm

always kind of reminded about when I'm speaking with

a native Spanish speaker is sometimes my expressions

of like, "Oh me encanta eso." They're like, wait, why is

it that you have such a strong affection towards

such an item or affinity towards something. And I

don't realize it because in English we say love for

anything. "Oh I love this coffee." But you wouldn't say

that in Spanish or in French because they're like, that

word has a lot of significance. It has a lot of meaning

behind it and it makes you start thinking a little bit

more about the word choices.

Morgan Mendis: And it's actually sometimes interesting to hear

multilingual people speak because I think that they

have a different perception of words potentially. And

they're more sensitive at times than people who are only

speaking one language because they're like, "This is

the word that we know and it's common

understanding." But people are like, "No, it's not that

common." It's all about context and place. And it's the

same way with data science tools. For example, me

using Python might be advanced to somebody who's

only using Excel. Right? But me using Python might

be trivial to somebody who's only working in Scala or

who knows, only working in Julia down the road or...

Kirill Eremenko: Yeah. Yeah, I totally get what you mean. Yeah, it's all

very relative and the more you explore, the more

variety, the more things you can see about your past

experiences, your future experiences, how others are

working on tools. I think it's a very exciting, exciting

thing to constantly be learning. And it's great to hear

from you, like, because obviously at the start of your

career, it's very exciting to be learning. But in your

situation, I know you say you don't think of yourself as

a highly advanced data scientist. I really think of you

as a very advanced data scientist. I think from what

we discussed on this podcast, our listeners will agree.

It's very inspiring to hear that coming from you

that even once you've accomplished so much and that

you're now pursuing your dreams and passion projects

and things like that, you're still very excited about

learning. So that never stops. And I truly admire all of

that. So thank you a lot for that inspiration in just the

way that you approach data science yourself.

Morgan Mendis: I appreciate the compliments, but I hope that I can live

up to them.

Kirill Eremenko: For sure. And so unfortunately we're running out of

time. And just before we wrap up, I wanted to do one

more shout out to this amazing undertaking that

you're doing in Haiti, Ayiti Analytics and how you're

bringing up this new generation of data

scientists who are going to be changing a lot of things

locally and in the world and bringing good to the

world. How can people support this cause? Is there

anything that our listeners can do in order to

participate or just spread the word about what you're

doing in Haiti?

Morgan Mendis: Yeah, definitely take a look at ayitianalytics.org, which

I'm sure Kirill is going to have posted on this podcast

description, but we're definitely looking for

collaborators in the U.S., in Europe, in Africa

especially as well. We're really trying to advance data

science in the country and we want to find other

practitioners, other advanced data scientists who are

willing to give back in terms of their time to help

others grow, especially in giving opportunities for

people who are potentially going to interact with

exciting data to help develop their own country in a

way. And so we're definitely looking for collaborators,

mentors, especially if you speak Haitian Creole and

French. We're also looking for people to help us in

terms of translating some of the content that is online

into the local language.

Morgan Mendis: But yeah, if you just want to get involved and talk with

some of the students and other practitioners in our

group, that is also very welcome. So we hope that

people are interested and they want to collaborate and

they want to give back. And I think that that's a key

thing is that we're an organization that wants to use

data science for the benefit of everybody and we are

really looking for collaborators to come in and try to

help us with that.

Kirill Eremenko: Fantastic. Thank you so much for sharing. And I will

share the website link in the show notes, but just to

mention it here: it's Ayiti Analytics and Ayiti is spelled

A-Y-I-T-I. As you mentioned it's a play on words and

how Haiti is pronounced, right?

Morgan Mendis: Yeah. The phonetic pronunciation.

Kirill Eremenko: Of Haiti. Okay. Awesome. So ayitianalytics.org if you

want to check it out. Morgan, once again, thank you so

much. How can our listeners get in touch with you

and you said LinkedIn, right? Is the best way to get in

touch with you?

Morgan Mendis: Yeah, and if they want to also send me an email

directly, they can also hit me up at

[email protected].

Kirill Eremenko: Mm-hmm (affirmative). Awesome, awesome. So

definitely people get in touch. Morgan, you're doing

some fantastic work. Before I let you go, one more final

question for you. What's a book that you can

recommend to our listeners to help inspire their

careers?

Morgan Mendis: I'm going to hit you back with two books actually.

Kirill Eremenko: Nice.

Morgan Mendis: One book that I think transformed my life in data

science early on was a book called The Manga Guide to

Databases.

Kirill Eremenko: Mm-hmm (affirmative).

Morgan Mendis: So go out, buy this book, give it to, you can give it to

somebody who's in middle school. They can learn

databases, they can learn SQL. They can understand

how it works from this book. This is how I learned

SQL. I learned SQL on the fly at an internship via this

book, 100% believe in it. It's really, really easy to learn

and read. It's almost for any level of reader. The other

book is, I really want to push some human centered

design, so it's The Design of Everyday Things. It's

another great book to give you an understanding into

how you can use design principles in whatever you're

hoping to do. Whatever it is, you can always benefit

from interacting with your clients or interacting with

your stakeholders and really understanding how you

can afford them greater value. So The Design of

Everyday Things.

Kirill Eremenko: Fantastic. Thank you so much. So The Manga Guide to

Databases and The Design of Everyday Things.

Morgan, once again, thank you so much for coming on

the show and sharing all your insights, inspiration

with us and also just showing us what it's like to live

the life of an advanced data scientist. I think we can

all aspire towards that and it's really cool to see how

you follow your dreams and hopefully those of us who

can help and assist will get in touch and assist you in

pursuing your passions and making a difference in

other people's lives in data science. Thank you so

much for being here today.

Morgan Mendis: Thanks for having me Kirill.

Kirill Eremenko: So there you have it ladies and gentlemen, that was

Morgan Mendis and thank you so much for being part

of our conversation here today. I know we went a bit

over but I hope you enjoyed it as much as I did and got

as many useful takeaways from the conversation. For

me, the biggest takeaway in terms of contributing to

society was of course what Morgan is doing with Ayiti

Analytics. So if you can help in any way, please get in

touch. I've already spoken to Morgan. I've told him and, as

I mentioned on the podcast, we're going to supply as

many SuperDataScience courses as needed, absolutely

free of charge to this cause in order to help get

Morgan's team and the people that they're coaching up

to speed with the concepts of data science, machine

learning, artificial intelligence.

Kirill Eremenko: We just counted this up after the podcast. We have

about four courses that we already have in French and

we are going to be supplying these to Ayiti Analytics in

order for them, as they need these courses, in order for

them to help bring people up to speed with data

science and make an impact in the world. So get

involved if you can. The website is ayitianalytics.org and

we'll definitely share that in the show notes. And in

terms of technical aspects, my third takeaway was

Airflow and the whole concept of this advanced ETL

tool. I was very excited to hear about that. In fact, we

spoke with Morgan after the podcast and I invited

Morgan to come and present at DataScienceGO 2020

and he agreed.

Kirill Eremenko: So Morgan is going to be presenting at DSGO 2020.

We already have the topic. It's most likely going to be

Airflow and visual diagnostics for machine learning. It's

going to be an advanced workshop, a workshop for the

advanced track for our advanced practitioners. So if

you can make it to DataScienceGO, you will get to see

Morgan there and perhaps even attend his workshop.

The dates are 6th, 7th, 8th of November. So as usual,

you can already get your tickets at

datasciencego.com.

Kirill Eremenko: So that is our episode for today. As usual, you can get

all the materials that we mentioned, any links to

Morgan's LinkedIn, to his email, to the projects that

he's working on. Also to the position that we talked

about, the VP of data science. You can get all of those

things at the show notes at

www.superdatascience.com/321. That's

superdatascience.com/321. Check it out and get what

you are most interested in, what you are most curious

about.

Kirill Eremenko: And of course hit Morgan up, connect with him on

LinkedIn. It's great to have advanced data scientists in

your network, people who you can come to with

questions, ask about their careers or just follow their

careers and see where it takes them. It's always

inspiring to see how an advanced data scientist

chooses to structure their career going forward.

Kirill Eremenko: On that note, once again, thank you so much for being

here today. If you'd like to meet Morgan, check out

datasciencego.com and I look forward to seeing you

back here next time. Until then, happy analyzing.