Open Source @ IBM R Community Update Augustina Ragwitz, IBM Cognitive Open Tech August, 2017

Today's Agenda

• What is R?

• How was R created?

• Where is the R community?

• Technical overview

• What's the current status of R?

• What is next for the R community?

• R at IBM

• Let's Get Started

• Call to Action

What is R?

• Free Open Source alternative to Stata, Matlab, SPSS, and SAS

• Preferred by students because it is free; they enter the workforce already trained in it (Fast Company, 2014)

• Developed by and for researchers and analysts who do not have a traditional programming background

• Easy to use; low overhead to get up and running

• Extensible through packages via CRAN, Bioconductor, and rOpenSci

R is a programming environment for statistical analysis + graphics

How was R created?

• Ross Ihaka and Robert Gentleman (University of Auckland, NZ) in 1992

• First stable release (1.0.0): 2000

• Annual x.y.0 releases in Spring

• Patches released as needed (x.y.z)

• Final patch release of the previous version just before the new one

• Current major version: 3.x (the series began with 3.0.0)

• Learn more about R core Internals: https://cran.r-project.org/doc/manuals/r-release/R-ints.html

Created by Statisticians for Statisticians

Where is the R community?

• CRAN – R Package Repository
  • https://cran.r-project.org/
  • User-submitted R code to extend the R language

• R Foundation
  • https://www.r-project.org/foundation/
  • Support R community

• R Consortium
  • https://www.r-consortium.org/
  • Founded in 2015
  • Bridge Community and Enterprise Interests
  • Platinum Companies include IBM, Microsoft, RStudio
  • IBM: Board + Steering and Marketing Committees

R: Analyst Technical Overview

Data gathering to analysis to publishing streamlined!

• Convert unstructured data into tables (readr, tidyr)

• One-liner statistical analysis (dplyr)

• Easy data visualization (ggplot2)

• Reactive JavaScript app generated from R code (shiny)

• Publish research + code in HTML, PDF, and other formats (rmarkdown, knitr)

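    # packages used below
    library(readr)
    library(tidyr)
    library(dplyr)

    # gather
    my_data <- read_csv("my_data.csv")
    df <- as_data_frame(my_data)

    # summarize: drop rows without a name, split "last,first",
    # then count rows per last name
    df <- df %>%
      filter(!is.na(name)) %>%
      separate(name, c("last", "first"), sep=",") %>%
      group_by(last) %>%
      summarise(total=n())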
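
The gather-and-summarize steps can be tried without a CSV file. The sketch below is a minimal, self-contained variant: the inline table and its values are made up for illustration and stand in for my_data.csv, and the result name `totals` is an assumption, not something from the slides.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for my_data.csv: one "last,first" name per row,
# with one missing name to exercise the filter step
df <- tibble(
  name  = c("Smith,Ann", "Smith,Bob", NA, "Jones,Cy"),
  score = c(90, 85, 70, 60)
)

totals <- df %>%
  filter(!is.na(name)) %>%                               # drop the row with no name
  separate(name, c("last", "first"), sep = ",") %>%      # split into two columns
  group_by(last) %>%                                     # one group per last name
  summarise(total = n())                                 # count rows in each group

print(totals)
```

With this sample data, `totals` has two rows: Jones with a total of 1 and Smith with a total of 2. The row with a missing name is dropped before counting.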

R: Developer Technical Overview

Integrating R into Production Workflows

Rserve provides a socket interface to existing applications

    > install.packages("Rserve")
    > library(Rserve)
    > Rserve()

Plumber generates API endpoints from annotated R code

    #* @post /sum
    addTwo <- function(a, b) {
      as.numeric(a) + as.numeric(b)
    }

https://www.rplumber.io/
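
To show how the annotated function becomes a running endpoint, here is a minimal sketch using the plumber package's `plumb()` function and the router's `run()` method. Writing the endpoint to a temporary file and the choice of port 8000 are assumptions made so the example is self-contained; in practice the annotated code would live in its own R file.

```r
library(plumber)

# Write the annotated endpoint from the slide to a temporary file
# (a temp file is used here only to keep the example self-contained)
api_file <- tempfile(fileext = ".R")
writeLines(c(
  "#* @post /sum",
  "addTwo <- function(a, b) {",
  "  as.numeric(a) + as.numeric(b)",
  "}"
), api_file)

# Parse the #* annotations into a router object
api <- plumb(api_file)

# Start serving requests. This call blocks the R session, so it is left
# commented out here; port 8000 is an arbitrary choice.
# api$run(port = 8000)
```

Once the server is running, a POST request to /sum with parameters a and b should return their sum.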

What is the current status of R?

Over 11,000 packages on CRAN!

Most popular open source tool in academic research

Top language among industry data scientists

What's next for the R community?

• Improve Tooling and Support
  • Code Coverage
  • RHub (hosted testing + validation of R packages)

• Big Data and Cloud Improvements
  • Unified Framework/API for Distributed Computing
  • Better database integration via DBI
  • Support scalable Spatiotemporal/raster datasets

• Community Training and Outreach
  • Software Carpentry/Data Carpentry workshops
  • Support R User Groups (RUGs)
  • Diversity Initiatives (R-Ladies)

R at IBM

• Learn R and Data Science through CognitiveClass.AI
  • Data Science with R: https://cognitiveclass.ai/learn/data-science-r/

• Data Science Experience + R
  • RStudio: https://datascience.ibm.com/docs/content/analyze-data/rstudio-overview.html

• R + Watson Natural Language Understanding (NLU): https://apsportal.ibm.com/exchange/public/entry/view/1015c435b898fb629a7e7523be151aed

• developerWorks Code
  • https://developer.ibm.com/code/patterns/detect-change-points-in-iot-sensor-data/
  • https://developer.ibm.com/code/patterns/category/data-science/

R: Let's Get Started

• Install
  • CRAN - https://cran.r-project.org/
  • RStudio - https://www.rstudio.com/

• Learn
  • Install the swirl package - http://swirlstats.com/
  • R for Data Science by Hadley Wickham - http://r4ds.had.co.nz/
  • Statistics with R specialization on Coursera (free to audit) - https://www.coursera.org/specializations/statistics

• Explore
  • Big Data Analysis + Machine Learning with R + Apache Spark
    • R4ML: https://www.ibm.com/support/knowledgecenter/en/SSPT3X_4.2.5/com.ibm.swg.im.infosphere.biginsights.tut.doc/doc/tut_Mod_R4ML.html
  • Data Science for Automotive Lab (R in Jupyter Notebook)
    • https://github.com/kurlare/DSforAutomotive
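
For the "Install the swirl package" step, a minimal sketch of what that could look like from an R session follows. The helper name `start_swirl` is hypothetical, introduced only for this illustration, and the CRAN URL passed as `repos` is the one listed under Install.

```r
# Hypothetical helper: install swirl from CRAN if it is missing,
# then launch its interactive lessons.
start_swirl <- function(repos = "https://cran.r-project.org") {
  if (!requireNamespace("swirl", quietly = TRUE)) {
    install.packages("swirl", repos = repos)
  }
  swirl::swirl()
}

# swirl is interactive, so call the helper from the R console or RStudio:
# start_swirl()
```

Wrapping the steps in a function keeps the snippet safe to source in a script; nothing is installed or launched until `start_swirl()` is called.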

Call to Action

• Find a RUG Meetup
  • https://www.meetup.com/topics/r-project-for-statistical-computing/

• Attend an R conference
  • useR, EARL, RStudio::conf, Open Data Science West/East

• Use Twitter hashtag #rstats

• Join a Mailing List
  • https://www.r-project.org/mail.html

• Submit a community proposal
  • https://www.r-consortium.org/projects/call-for-proposals

• Join a Working Group
  • https://www.r-consortium.org/projects/isc-working-groups

• Subscribe to the IBM Code monthly newsletter: https://www.pages03.net/ibmdeveloperworks/developerWorks-IBMCodeNewsletterSubscriptionPage-secure/

• Subscribe to future Code Tech Talks: https://www.pages03.net/ibmdeveloperworks/developerWorks-IBMCodeTechTalkSubscriptionPage-secure/

Q & A