Visualizing the Nike+ FuelBand Geographically

Timothy Shi, Mitchell Fukumoto, Wesley Leung

Computer Science, Stanford University

{timshi | mitchellfukumoto | wes}@cs.stanford.edu

ABSTRACT

Nike currently has millions of users recording hundreds of millions of statistics with its new product, the Nike FuelBand. The FuelBand awards points for energy expenditure and provides a universal means of comparing physical activity across individuals. We wanted to show Nike, concisely and accurately, where users across the nation were generating NikeFuel points. We did so through heat maps of selected areas built with the Google Maps API, with more detailed information and aggregates delivered via D3. Our final implementation is Nike+Stanford, a visualization tool supported by Nike’s large data set that allows users to graphically represent fuel point data over a span of time in a geographical region.

Author Keywords

HCI, Stanford, Nike, Google Maps, Nike+, Fitness, FuelBand.

ACM Classification Keywords

H5.2. Graphical User Interfaces.

INTRODUCTION

“If you have a body, you are an athlete.” Bill Bowerman, Nike Co-founder

The Nike FuelBand debuted in February 2012 with this quote in mind. The Nike FuelBand aims to measure physical activity performed by its users by assigning users NikeFuel points for the energy they expend regardless of their height, weight, age, etc.; a fuel point is a universal measure of physical activity. Users can then track their physical activity and set daily goals for themselves.

One problem that Nike currently faces is the integration of fuel points with its existing platform for tracking physical activity - the Nike+ running application. Users were able to visualize their runs on a map tracked via GPS, but the visualizations of fuel points and running data were not integrated even though they are related concepts. This was the original problem we attempted to solve.

We worked closely with Nike, who graciously provided us with a 2 terabyte data dump of their databases. However, many problems arose from dealing with a data set of that size. While hard drive space was an obvious factor, data imports through Oracle, data migration to a server, and serving the data from a server in a reasonable time were all problems that we did not anticipate. Because of these limitations, we were forced to pivot from creating a user-centric application to a Nike-centric one.

With the FuelBand and fuel points being less than a year old, Nike currently does not have good ways of visualizing the multitudes of data they are collecting on their users. We thought that a key component missing from the Nike FuelBand data was a simple map of where fuel points were being generated. This would not only tell Nike where the most fuel points were being generated, but could also speak to the sales and marketing of the FuelBand nationally.

RELATED WORK

Since the launch of Nike+ and nikeplus.com in 2006, Nike has been a leader in innovating new means of tracking and visualizing personal fitness data. The original Nike+ site first launched with the ability to show users simple stats about their running history, such as average pace, total distance, and calories burned, and has grown tremendously in the last 6 years. Armed with GPS data from the Nike+ Running App (available on iOS and Android), Nike+ is able to map a user’s run history, plot elevation changes along a run, and show popular community running routes in a heat map generated by averaging all runs in their database. With the launch of the FuelBand in February 2012, Nike+ is now capable of showing users their complete activity history throughout the day in the form of line graphs and histograms. Altogether, the Nike+ site provides users with a detailed look into their fitness history, most importantly providing them with insights about their progress through aggregations of all of the data Nike has collected. Nike is uniquely positioned because of the sheer size of its dataset, and it has been a pioneer in the fitness space by democratizing this data with the single-focused goal of helping the Nike+ global community to run and exercise more often.


Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 2012 Shi, Fukumoto, Leung, Nike, Inc.

Other than Nike, there have been a number of applications, ranging from startup products to research, that leverage visualization for analyzing physical fitness. One example is Fitster, a social “visualization interface that supports fitness motivation,” which employs data recorded from a pedometer for fitness activity tracking and goal setting (5). Fitster aims to address the growing problem of sedentary behavior by using gamification and competition to motivate users. Much like Nike, the application is a step forward in allowing users to not only quantify their fitness but also use graphical displays to make physical activity patterns visible. Other products such as Fitbit aim to accomplish the same goals of collecting fitness data and developing a story with it to help users improve their physical fitness (2).

Determining what data to show in a visual display for fitness is a promising but also risky endeavor. As illustrated by HCI research from Intel Corporation, “visualization designers must incorporate and display multiple measures of progress, and situate them sensitively in a wider social context” (4). The research finds that too much visualized information can be de-motivating, while too little can cause potentially beneficial data to go unnoticed. This research was particularly relevant in helping our team realize that simply displaying data is not always helpful; with regard to physical fitness, we need to keep social implications in mind and abstract away information that can be detrimental.

IMPLEMENTATION

We implemented a front-end GUI as well as a server and an in-house set of APIs that process Nike+’s data so that it can be served to the user. The following two sections describe the separate front-end and back-end components.

Front-end User Interface

Our application allows for simple exploration and searching of data on a map. It primarily consists of three components: a map canvas, a user interface tray, and a bar graph.

The front-end is built on top of Google’s Maps API, which already implements tools for constructing map overlays. In conjunction with Google Maps, we leveraged elements of Twitter Bootstrap, a front-end toolkit that provides pre-built web templates, for rapid creation of the content tray and sliders. To construct the bar graph visualizations, we used D3.js, a JavaScript-based graphing framework, to bind data to the Document Object Model (DOM) and create elements on the SVG canvas for users to see. The remaining web elements were implemented using jQuery UI, a JavaScript user interface library.

From the main page, the user has two methods to locate a region they are interested in exploring: they can either pan and zoom using Google’s built-in map canvas or use a search bar in the upper right-hand corner to locate a region based on zip code, address, or general key terms. After locating the desired region, the user can click the “play” icon or open the bottom tray to further refine their query based on the date range of the data they want to fetch. Performing these actions triggers our application’s main visualization, which calls our server.

The NikeFuel point visualization features a heat map that plots the densities of fuel points over the Google map. To generate the visualization, we send the latitude and longitude bounds of the user’s current viewport, as well as their specified date range, to our server. When the user clicks the “play” button, a progress bar on the tray updates so that the user has a rough idea of how long their requested data will take to process. Our APIs filter the Nike+ data based on the user inputs and return a JSON-formatted response.

Each instance of a user’s physical activity in the JSON object contains the number of fuel points earned as well as the zip code of their residence. Because the Nike+ FuelBand does not collect GPS data, we assume that the user’s fuel points are gained at their residence. In order to plot this information on the map, we geocode each zip code to produce latitude and longitude coordinates and assign the fuel point figure to the weight of the heat map node. Google’s Maps tools handle the heavy lifting of drawing the points on a map. On our end, we only need to feed Google the fuel points earned at each zip code, which are automatically scaled onto a color gradient.
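The aggregation step described above, summing fuel per zip code and attaching coordinates, can be illustrated with a minimal Python sketch. The field names (`postal_code`, `fuel`), the lookup table, and the output shape are assumptions made for illustration, not the actual Nike+ schema or our API's response format; the coordinates are approximate.

```python
from collections import defaultdict

# Hypothetical lookup table: postal code -> (latitude, longitude).
# The coordinates are approximate and for illustration only.
POSTAL_CODE_COORDS = {
    "94301": (37.4443, -122.1598),  # Palo Alto
    "94086": (37.3764, -122.0238),  # Sunnyvale
}


def build_heatmap_points(activities, coords=POSTAL_CODE_COORDS):
    """Sum fuel per postal code and return weighted points for a heat map layer."""
    totals = defaultdict(int)
    for activity in activities:
        totals[activity["postal_code"]] += activity["fuel"]

    points = []
    for postal_code, fuel in sorted(totals.items()):
        if postal_code not in coords:
            continue  # skip records whose postal code could not be geocoded
        lat, lng = coords[postal_code]
        points.append({"lat": lat, "lng": lng, "weight": fuel})
    return points


sample = [
    {"postal_code": "94301", "fuel": 2500},
    {"postal_code": "94301", "fuel": 1800},
    {"postal_code": "94086", "fuel": 900},
    {"postal_code": "00000", "fuel": 100},  # unknown postal code, dropped
]
print(build_heatmap_points(sample))
```

Each resulting point carries one weight per zip code, which is the quantity a weighted heat map layer needs.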

Locations that have a higher aggregate number of fuel points will be darker -- in our case, closer to a red color -- while areas with less recorded physical activity will be marked with a lighter color -- green, in our implementation.

Additionally, the front-end collects aggregate information about the set of Nike+ activity, which is represented in our visualization by a bar graph of total fuel versus time. The D3.js framework allows us to rapidly graph the aggregate amount of fuel points earned per day in the mapped region. On hovering over a bar, a user can view the date and total fuel points earned by all members in the current map view. Together, the map and bar graph tell the story of how many fuel points are being acquired with respect to location and time. The user can make additional queries by moving the map to a new location of interest and repeating the search process.

Figure 1. Fitness tracking apps such as Fitster have emerged in recent years.
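The per-day aggregation behind the bar graph of total fuel versus time can be sketched in a few lines of Python. This is an illustrative sketch only: the record fields `day` and `fuel` are assumed names, and the real aggregation may happen on the server or in the browser.

```python
from collections import Counter
from datetime import date


def daily_fuel_totals(activities):
    """Return (date, total fuel) pairs sorted by date, one pair per bar."""
    totals = Counter()
    for activity in activities:
        totals[activity["day"]] += activity["fuel"]
    return sorted(totals.items())


sample = [
    {"day": date(2012, 3, 2), "fuel": 1200},
    {"day": date(2012, 3, 1), "fuel": 3000},
    {"day": date(2012, 3, 2), "fuel": 800},
]
for day, fuel in daily_fuel_totals(sample):
    print(day.isoformat(), fuel)
```

The sorted pairs map directly onto the bars: the date is the horizontal position and the total is the bar height.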

Data Wrangling and Optimization Strategies

To build an application that could quickly and reliably query large amounts of data, we turned to a combination of products on Heroku’s (www.heroku.com) platform as a service (PaaS), leveraging their Django platform and shared Postgres service. Overall we had two major concerns: efficiency and price. Using Heroku’s free developer application and database, we were able to create a speedy application and database layer that could handle our 10 million row dataset, although doing so required some resourcefulness.

At the database layer we chose Heroku Postgres because of Postgres’ strengths as a database and Heroku’s guaranteed management and service uptime. Because of the general speediness of the database and service, we were able to provision and configure our system, then run our conversion scripts from the Oracle CSV exports to Postgres in a mere 12 hours, far less than the days it took to configure, import, and export the Oracle data. To ensure that this process ran smoothly, we ran everything on Heroku’s servers: we hosted the CSVs on our Stanford AFS drives and wrote Python scripts to run on Heroku that could download, parse, and import the CSVs into the hosted Postgres database. Once we completed that process, we knew we had a solid solution (with daily backups included) that could reliably service our needs to process big data.

To bridge the gap between our database and the front-end application, we built our application layer on Heroku’s hosted Django platform for its speed, its managed object-relational mapper (ORM), and our familiarity with the language and platform. By leveraging existing Django resources in the open source community such as Django-Skel (https://github.com/rdegges/django-skel), we succeeded in connecting a running application to our Heroku database layer in a couple of hours, with uptime guaranteed by Heroku. With Django, our models representing Nike users and activity records hooked directly into our database schema, allowing us to efficiently query Postgres without having to write complicated SQL. Additionally, Python’s lightweight nature gave us the ability to build large JSON dictionaries with 10,000+ entries within a small memory footprint (~50 MB).

Although we were lucky to receive fast machines for free through Heroku’s developer program, we still needed to build efficient processes for managing, querying, and delivering the data to the client. Of the different optimizations across the various layers of our visualization system, by far the most impactful were the efficiency gains made by targeted indexing of our Postgres database. One of the more important reasons for choosing Postgres was the ease and speed with which we could create indices using psql, the built-in command line interface.

CREATE INDEX fuelmapper_nikesportactivity_postal_code_index
    ON fuelmapper_nikesportactivity (postal_code NULLS LAST);
CREATE INDEX fuelmapper_postalcode_lng_index
    ON fuelmapper_postalcode (lng NULLS LAST);
CREATE INDEX fuelmapper_postalcode_lat_index
    ON fuelmapper_postalcode (lat NULLS LAST);

These three statements alone resulted in a roughly 16-fold reduction in time spent in the database, significantly improving the responsiveness of our application and allowing us to query and deliver more data points at a time. After these optimizations, our application went from reliably serving 100 rows from the database in a request to easily handling 10,000+.
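To show the kind of query these indices serve, here is a minimal, self-contained sketch of a viewport (bounding box) lookup. It uses Python's built-in `sqlite3` module as a stand-in for Postgres so that it runs without a server. The table and column names are simplified, hypothetical versions of our schema, and date filtering is omitted for brevity.

```python
import sqlite3

# SQLite stands in for Postgres here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE postalcode (code TEXT PRIMARY KEY, lat REAL, lng REAL)")
conn.execute("CREATE TABLE activity (postal_code TEXT, fuel INTEGER)")

# Indexes on the join column and on both coordinate columns.
conn.execute("CREATE INDEX activity_postal_code_index ON activity (postal_code)")
conn.execute("CREATE INDEX postalcode_lat_index ON postalcode (lat)")
conn.execute("CREATE INDEX postalcode_lng_index ON postalcode (lng)")

conn.executemany(
    "INSERT INTO postalcode VALUES (?, ?, ?)",
    [("94301", 37.44, -122.16), ("97005", 45.49, -122.80)],
)
conn.executemany(
    "INSERT INTO activity VALUES (?, ?)",
    [("94301", 2500), ("94301", 1800), ("97005", 3100)],
)


def fuel_in_viewport(conn, south, north, west, east):
    """Total fuel per postal code inside a latitude/longitude bounding box."""
    rows = conn.execute(
        """
        SELECT p.code, SUM(a.fuel)
        FROM postalcode AS p
        JOIN activity AS a ON a.postal_code = p.code
        WHERE p.lat BETWEEN ? AND ? AND p.lng BETWEEN ? AND ?
        GROUP BY p.code
        ORDER BY p.code
        """,
        (south, north, west, east),
    )
    return rows.fetchall()


# A box around the San Francisco Bay Area; prints [('94301', 4300)]
print(fuel_in_viewport(conn, 37.0, 38.0, -123.0, -121.0))
```

The range conditions on `lat` and `lng` and the join on `postal_code` are the operations that indexed columns let a database answer without scanning every row.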

After optimizing at the database layer, we still had significant room for improvement at the application layer to increase cache hits and reduce reliance on the database. Initially, to deliver location data, we looked up postal code coordinates for every activity, resulting in thousands of unnecessary database disk hits for each request. We removed these by taking advantage of Python’s low memory overhead to keep objects like our “PostalCode” model in memory, essentially eliminating database hits to look up postal code regions for drawing purposes.
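The idea of holding postal code data in memory can be sketched as follows. This is a minimal illustration, not our actual Django code: `PostalCodeCache` and `fake_bulk_query` are hypothetical names, and the bulk loader stands in for a single query that reads every postal code row.

```python
class PostalCodeCache:
    """Load postal code coordinates once, then answer lookups from memory."""

    def __init__(self, load_all):
        # load_all is a hypothetical callable that performs one bulk database
        # read and returns {postal_code: (lat, lng)}.
        self._load_all = load_all
        self._coords = None

    def get(self, postal_code):
        if self._coords is None:
            self._coords = self._load_all()  # single database hit
        return self._coords.get(postal_code)


db_hits = 0


def fake_bulk_query():
    """Stand-in for a bulk database read; counts how often it is called."""
    global db_hits
    db_hits += 1
    return {"94301": (37.44, -122.16), "97005": (45.49, -122.80)}


cache = PostalCodeCache(fake_bulk_query)
for code in ["94301", "97005", "94301", "99999"]:
    cache.get(code)
print(db_hits)  # prints 1: four lookups, one database read
```

The trade-off is memory for latency: the whole table lives in the application process, which is reasonable when the table of postal codes is small relative to the activity data.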


Figure 2. Significant reduction in app server response time due to increases in database efficiency.

Figure 3. Reduced database operations after optimizing the application layer.


A final weak point, eliminated through being scrappy and resourceful, was our reliance on external services to deliver the geographic data on postal codes that the application required to draw fuel data on the map appropriately. Our initial reliance on GeoNames (www.geonames.org) to look up postal codes by geographic coordinates and on the Google Maps Geocoding API (http://maps.googleapis.com/maps/api/geocode/json) to deliver geographic boundaries took far too long and often ran us into API rate limits that would crash our service. To remedy this, we wrote additional database loading scripts that query the above APIs and save their results to our optimized database for quick processing.
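The general pattern of saving external lookups locally can be illustrated with a short sketch. Note the difference from our approach: our loading scripts fetched results in bulk ahead of time, whereas this sketch fetches lazily on the first miss. The `fetch_remote` callable is a hypothetical wrapper around an external geocoding service, replaced here by a stub so the example runs offline; SQLite stands in for Postgres.

```python
import sqlite3


def make_store():
    """Create a local table that holds previously fetched coordinates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE geocode (postal_code TEXT PRIMARY KEY, lat REAL, lng REAL)"
    )
    return conn


def lookup(conn, postal_code, fetch_remote):
    """Return (lat, lng), calling the remote service only on a local miss."""
    row = conn.execute(
        "SELECT lat, lng FROM geocode WHERE postal_code = ?", (postal_code,)
    ).fetchone()
    if row is not None:
        return row
    lat, lng = fetch_remote(postal_code)  # hypothetical external API wrapper
    conn.execute("INSERT INTO geocode VALUES (?, ?, ?)", (postal_code, lat, lng))
    return (lat, lng)


remote_calls = []


def fake_remote(postal_code):
    """Stand-in for a rate-limited geocoding service."""
    remote_calls.append(postal_code)
    return (45.49, -122.80)


store = make_store()
first = lookup(store, "97005", fake_remote)
second = lookup(store, "97005", fake_remote)
print(first, second, len(remote_calls))
```

Either way, each postal code costs at most one external request, which keeps the service under the rate limits regardless of how many map requests arrive.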

RESULTS

In our demonstrations to users while testing and presenting, we received strong positive feedback on the cleanliness and responsiveness of the front-end of our application, but many questions about the slow data response and the implications of our data. Ultimately, our data delivery to the page was a slow process (20+ seconds) compared to initial page rendering (~1 second). This disconnect confused users, as the quick page load gave them the expectation that all interactions would be snappy. When shown the visualization without prior explanation, users were additionally unsure about the usefulness of visualizing data on raw fuel generation. Many asked questions like “How might this help the average person?” and “What does Nike get out of this?” To remedy this and take the visualization further, the app will need: a) a means of explaining to users the takeaways they might find in the data and b) to scale and normalize the data used to generate heat maps by metrics such as number of users or total population, to give Nike a better business case for leveraging the visualization.
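The normalization proposed in point b) amounts to dividing each region's total fuel by a denominator such as its number of users. A minimal sketch follows, with made-up region names and totals; none of these numbers come from the Nike data set.

```python
def fuel_per_user(fuel_by_region, users_by_region):
    """Divide each region's total fuel by its number of active users."""
    normalized = {}
    for region, fuel in fuel_by_region.items():
        users = users_by_region.get(region, 0)
        if users > 0:  # skip regions with no known users to avoid dividing by zero
            normalized[region] = fuel / users
    return normalized


# Made-up totals for illustration only.
fuel = {"A": 120000, "B": 90000, "C": 5000}
users = {"A": 60, "B": 30}
print(fuel_per_user(fuel, users))  # prints {'A': 2000.0, 'B': 3000.0}
```

In this example region A generates more raw fuel, but region B is more active per user, which is the kind of distinction a raw heat map cannot show.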

DISCUSSION

We think that Nike can gain a lot of useful information by exploring our data visualization. Although Nike may have internal tools to monitor sales and marketing strategies, we believe our visualization can revolutionize how large companies view and measure the success of their products.

Because of the limitations of our server, we can make two types of queries to our database - visualizing a few cities over many days or a couple of states for a single day. Both of these queries yielded interesting results. For example, a visualization of the western United States shows that the residents of Salt Lake City either purchase more FuelBands than residents of Phoenix or are more active (see Fig. 4). Despite the fact that Salt Lake City’s population is 83% smaller than Phoenix’s population, their heat maps are of essentially the same intensity. This finding was consistent when selecting a date range near the FuelBand’s release and when selecting a date range a few months after the release.

When zoomed in on a few cities within a state, more detailed findings can be revealed for that state. While Sunnyvale’s population is more than twice that of Palo Alto, its heat map is very dim in comparison (see Fig. 5). This may be due to increased sales of the FuelBand in the more affluent Palo Alto neighborhood, less physical activity by residents of Sunnyvale, or a combination of the two.

Interesting findings can also be revealed by observing the bar graph at the bottom of the visualization. In all visualizations of the FuelBand’s early days, not many fuel points were being generated, probably due to the small number of units sold (see Fig. 5). However, there are definite spikes in fuel activity that occur in multiple regions of the United States on the same days. We hypothesize that these were due either to a marketing campaign or to an increase in the number of FuelBands produced to keep up with consumer demand.

Lastly, mapping select geographical regions during different time spans is a great way to yield interesting results. The left graph in Figure 6 shows a fuel map of Portland and Beaverton, where Nike is located, shortly after the FuelBand was released. The right graph in Figure 6 shows the same area about a month later. There is a clear influx of user activity in almost all locations - a good sign that Nike’s marketing is doing a great job in its home state.

Figure 4. Heatmap of California generated by nikeplusstanford.herokuapp.com

Figure 5. Heatmap of Silicon Valley generated by nikeplusstanford.herokuapp.com

Figure 6. Heatmap of the greater Portland area comparing 02/12-04/12 to 03/12-06/12

Although many of these findings might not be revolutionary news to Nike, they are presented in a way that we believe Nike has not seen or thought of before. Our visualization abstracts away the spreadsheets, tables, and graphs of data that Nike’s marketing and sales teams certainly use, and organizes the data concisely in terms that anyone can understand - here is a map and here is where users are buying and using the FuelBand.

One unique benefit of our visualization is that it could be used by Nike as a way to further market the FuelBand. Social influence is one way to convince a potential buyer to purchase a product. For example, Nike could show our visualization to a potential buyer in Palo Alto. Because of the density of fuel points earned there, the marketing pitch could be based on overall fitness in the Palo Alto area or simply on the “everybody’s doing it” social marketing pitch.

FUTURE WORK

Nike+ FuelBand data is awash with fascinating information about who is using Nike’s product and what they are doing to stay physically active. Our application takes the first step in collecting aggregate information about these users and visualizing it in an easily navigable way with regard to location and time. Our tool is by no means finished, however, as there is a plethora of possible technical improvements as well as additional ways to explore this data.

Technical Improvements

The most salient changes for our team would involve figuring out additional ways to wrangle the Nike dataset and developing new optimization strategies so that the user experience on the front-end can be as smooth as possible. Although database indexing has allowed us to deliver over 10,000 data points to the browser in a few seconds, the process still has inefficiencies that slow down the user experience. Due to budget limitations, we had to be more resourceful in how we processed massive amounts of data rather than looking at solutions such as Memcache, which would have sped up data collection from our API calls. Additional improvements that could be made without external tools could involve, for example, paging requests into smaller batches and incrementally drawing heat-map points to boost perceived speed from the user’s perspective.
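The paging idea mentioned above can be sketched with a small Python generator. This is an illustration of the batching logic only, under the assumption that the client draws each batch as it arrives; the batch size and record shape are arbitrary.

```python
def batches(points, size):
    """Yield consecutive slices of at most `size` points."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(points), size):
        yield points[start:start + size]


points = [{"id": i, "weight": i * 10} for i in range(10)]
drawn = []
for batch in batches(points, 4):
    drawn.extend(batch)  # a client would add this batch to the heat map here
    print(len(batch), len(drawn))
```

With ten points and a batch size of four, the loop reports batches of 4, 4, and 2 points. The user would see the first points after one small response rather than waiting for the entire result set.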

Additional Ways to Analyze Data

Nike only made FuelBand information available to us. Many users, however, use the FuelBand and the mobile Nike Running App in conjunction to quantify their physical activity. One of the original problems we set out to solve was to unify these separate applications and help users understand how their runs were connected to the FuelPoints they earned. Nike structures run data by taking 60-second snapshots during a user’s exercise and recording information such as FuelPoints and location. The FuelPoint data set, on the other hand, featured total points earned over 24-hour periods. We believed that these two data sets should be unified, so we experimented with overlaying paths of user runs on our application’s map canvas and encoding those runs with a color based on the number of Fuel points they earned. The result would juxtapose Fuel point heat maps of a location with the runs that occurred there.
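To make the snapshot model concrete, the following Python sketch simulates a run as a sequence of 60-second snapshots and assigns the run a color from its total fuel. It is a hypothetical illustration rather than our script: the step size, the per-minute fuel range, and the color threshold are arbitrary assumptions, and the blue and yellow colors follow the prototype discussed in this section.

```python
import random
from datetime import datetime, timedelta


def simulate_run(start_lat, start_lng, start_time, minutes, rng):
    """Return one snapshot per minute: timestamp, position, and fuel earned."""
    lat, lng = start_lat, start_lng
    snapshots = []
    for minute in range(minutes):
        # Small random step in each direction; the step size is arbitrary.
        lat += rng.uniform(-0.0005, 0.0005)
        lng += rng.uniform(-0.0005, 0.0005)
        snapshots.append({
            "time": start_time + timedelta(seconds=60 * minute),
            "lat": lat,
            "lng": lng,
            "fuel": rng.randint(5, 20),
        })
    return snapshots


def run_color(snapshots, threshold=300):
    """Blue for runs at or above the fuel threshold, yellow otherwise."""
    total = sum(s["fuel"] for s in snapshots)
    return "blue" if total >= threshold else "yellow"


rng = random.Random(42)  # fixed seed so the sketch is repeatable
run = simulate_run(37.7749, -122.4194, datetime(2012, 6, 1, 7, 0), 30, rng)
print(len(run), run_color(run))
```

Connecting consecutive snapshot positions yields a path that can be drawn as a colored polyline over the map.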

We experimented with run paths by generating a random set of coordinates and assigning FuelPoints and a timestamp to each one. This process was done through a Python script and simulates Nike’s “snapshot” model for recording runs. The result of random run generation over San Francisco is illustrated in Figure 7 -- blue lines mark runs with higher fuel point totals, while yellow lines mark runs with lower totals.

Figure 7. Prototype of run path visualization. Generated from randomized data over San Francisco.

Our current implementation is primarily concerned with providing data metrics for Nike and helping Nike discover usage patterns in their user base. This data, however, is packed with additional information such as social network information, user vitals, and more. Other applications could be designed to be more consumer-centric. For example, a consumer version of our application could integrate a social layer and show users where their friends are exercising and how many fuel points they are earning. Such a tool could be used as a motivational tool, as studies have shown that group workouts and accountability are effective in helping people reach their workout goals.

The Nike+ FuelBand is still in its nascent stages and has only just begun the process of helping users quantify their physical activity. Combined with user information such as vitals, living location, occupation, and social network activity, there can be thousands of possibilities for data analysis that leverage the “fuel” concept for research, business, and building consumer applications. We presented one application, but in time, we hope that programs such as Nike’s recently launched Nike Accelerator and independent developers can use big data to tackle new problems.

CONCLUSION

We present “Nike Plus Stanford,” a visualization tool for displaying aggregate fuel points acquired from a Nike+ FuelBand over time on a map. With this application, users can search for a region on a map and generate a heat map for the fuel points earned by Nike’s users. Concurrently, users can also view the number of points earned on a particular day in the queried region.

Current work in physical fitness visualizers only displays data for a single user and is not designed to synthesize data over large regions or for millions of users. Through a combination of database optimization techniques and server design, “Nike Plus Stanford” allows Nike to explore their massive database and discover FuelBand usage patterns since its introduction in February 2012. We hope that the ability to quantify fitness in the form of big data visualization can be used to improve physical fitness in the future.

ACKNOWLEDGMENTS

We thank CHI, PDC and CSCW volunteers, and all publications support and staff, who wrote and provided helpful comments on previous versions of this document. Some of the references cited in this paper are included for illustrative purposes only.

REFERENCES

1. Felton, Nicolas. “Nicholas Felton’s Students Hack Nike+ Data.” Fastco.Design, 03 Jun 2011. Web. <http://www.fastcodesign.com/1663977/infographic-of-the-day-nicholas-feltons-students-hack-nike-data>

2. Friedman, Eric and James Park. “Fitbit.” 2012. Web. <http://www.fitbit.com/>

3. Knight, Phil. “Nike+.” Nike. 2012. Web. <http://nikeplus.nike.com/plus>

4. Nadalutti, Daniele and Luca Chittaro. “Visual Analysis of Users’ Performance Data in Fitness Activities.” Computers & Graphics, Volume 31, Issue 3, June 2007. <http://www.sciencedirect.com/science/article/pii/S0097849307000635>

5. Ali-Hasan, Noor, Diana Gavales, Andrew Peterson, and Matthew Raw. “Fitster: social fitness information visualizer.” In CHI '06 Extended Abstracts on Human Factors in Computing Systems (CHI EA '06). ACM, New York, NY, 2006. <http://doi.acm.org/10.1145/1125451.1125792>
