Predicting The Next App That You Are Going To Use
Ricardo Baeza-Yates, Di Jiang, Fabrizio Silvestri, Beverly Harrison
The Idea
Yahoo Aviate Dataset: Events Distribution
Is App Prediction Easy?
Why Is Frequency Not the Best Signal?
Timeslots: for each timeslot i, count the number of apps opened in timeslot i and the number of times app a is opened in timeslot i.
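The two timeslot counts can be combined into a per-timeslot frequency signal. The sketch below is only an illustration: it assumes the counts are used as a ratio (times app a is opened in timeslot i, divided by all app opens in timeslot i), and the 4-hour slot width, the `(hour, app)` event format, and the function name are hypothetical choices, not taken from the slides.

```python
from collections import Counter, defaultdict


def timeslot_frequencies(events, slot_hours=4):
    """Return {timeslot: {app: share of opens in that timeslot}}.

    events: iterable of (hour_of_day, app_name) pairs (assumed format).
    slot_hours: width of a timeslot in hours (assumed value).
    """
    opens_per_slot = Counter()               # app opens in timeslot i
    opens_per_app = defaultdict(Counter)     # times app a is opened in timeslot i
    for hour, app in events:
        slot = hour // slot_hours
        opens_per_slot[slot] += 1
        opens_per_app[slot][app] += 1
    return {
        slot: {app: n / opens_per_slot[slot] for app, n in apps.items()}
        for slot, apps in opens_per_app.items()
    }


events = [(8, "mail"), (9, "mail"), (10, "maps"), (22, "skype")]
freqs = timeslot_frequencies(events)
```

With these sample events, hours 8 to 11 fall in the same slot, so the share for "mail" in that slot is two thirds.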
Classification Based Approach: Features
• Basic Features: Time, Latitude, Longitude, Speed, GPS Accuracy, Context Trigger, Context Pulled, Charge Cable, Audio Cable
• Session Features: Last App Open, Last Location Update, Last Charge Cable, Last Audio Cable, Last Context Trigger, Last Context Pulled
Classification Based Approach: +1’s and -1’s
[Figure: the +1 and -1 labels]
Session Features via Word2Vec
[Figure: session events ("Open Skype", "Location changed to XXX YYY", "Opening Mail", "Charge cable plugged") represented via Word2Vec; annotation: high cosine similarity]
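The cosine similarity mentioned on this slide can be sketched as follows. This is a minimal illustration of the measure itself; the example vectors are made up and are not Word2Vec embeddings from the dataset.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical embeddings of two session events.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction score close to 1, while orthogonal vectors score 0.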
Classification Based Approach: Models Tested
• Naïve Bayes
• SVM
• C4.5
• Tree Augmented Naïve Bayes
• Softmax
Experiments: The Dataset
• Yahoo Aviate log data
– From October 2013 to April 2014
• 480 active users chosen uniformly at random
• 80/20 train/test split
– Training done on a sliding window of 12 hours.
Experimental Results: Dominant Apps Filtered
• Precision: 90.2%
• State-of-the-art methods attain up to ∼20% precision on the same task.
App Cold Start
• When a new app is installed we do not have any signal for it:
– In particular, P(aui), the probability that u opens app i, is unknown.
– P(fi|pa(fi)), the prior probability for a given feature, can instead be obtained from other users.
Short-Term vs. Long-Term Apps
• We fit app usage data to a Beta(α, β) distribution and compute its excess kurtosis: a positive value indicates a short-term app, while a negative value is likely to indicate a long-term one.
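The kurtosis test can be sketched with the closed-form excess kurtosis of a Beta(α, β) distribution. The slides do not say how the distribution is fitted or how usage data is normalized, so the method-of-moments fit and the assumption that samples lie strictly between 0 and 1 are illustrative choices only; the function names are hypothetical.

```python
import statistics


def beta_excess_kurtosis(alpha, beta):
    """Closed-form excess kurtosis of a Beta(alpha, beta) distribution."""
    s = alpha + beta
    numerator = 6 * ((alpha - beta) ** 2 * (s + 1) - alpha * beta * (s + 2))
    denominator = alpha * beta * (s + 2) * (s + 3)
    return numerator / denominator


def fit_beta_by_moments(samples):
    """Method-of-moments estimate of (alpha, beta); samples assumed in (0, 1)."""
    mean = statistics.mean(samples)
    var = statistics.variance(samples)
    if not 0 < var < mean * (1 - mean):
        raise ValueError("sample moments are not compatible with a Beta fit")
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def app_term(samples):
    """Label an app by the sign of the fitted excess kurtosis."""
    alpha, beta = fit_beta_by_moments(samples)
    return "short-term" if beta_excess_kurtosis(alpha, beta) > 0 else "long-term"


uniform_kurtosis = beta_excess_kurtosis(1, 1)   # Beta(1, 1) is the uniform distribution
skewed_kurtosis = beta_excess_kurtosis(1, 10)   # strongly skewed shape
fitted = fit_beta_by_moments([0.05, 0.1, 0.02, 0.9, 0.03])
```

Beta(1, 1) is the uniform distribution, whose excess kurtosis is -1.2, which gives a quick sanity check of the formula.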
P(aui) Estimation Using Users’ Data
• Short-term apps:
– P(aui) = P(ai)
– After a fixed amount of time (e.g. 2 hours) we start using P(aui) from the user’s history.
• Long-term apps:
– P(aui) is estimated as a Bayesian average of the following quantities:
– the number of times app ai has been opened by u,
– the total number of apps opened by u, and
– the average number of times ai has been opened in general.
– C is the weight we give to the “other users” component.
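The exact formula did not survive on the slide, so the following is only a sketch of the generic Bayesian average, (C · m + x) / (C + n), applied to the quantities listed above. Treating the "other users" component m as a prior open rate between 0 and 1 is an assumption made here, as are the function and parameter names.

```python
def bayesian_average(opens_by_user, total_opens_by_user, prior_rate, weight_c):
    """Generic Bayesian average: (C * m + x) / (C + n).

    opens_by_user: times app a_i was opened by user u (x)
    total_opens_by_user: total app opens by user u (n)
    prior_rate: rate for a_i estimated from other users (m, assumed in [0, 1])
    weight_c: weight C given to the "other users" component (must be > 0)
    """
    if weight_c <= 0:
        raise ValueError("weight_c must be positive")
    return (weight_c * prior_rate + opens_by_user) / (weight_c + total_opens_by_user)


# With no history for the user, the estimate falls back to the prior.
no_history = bayesian_average(0, 0, prior_rate=0.1, weight_c=10)
# As the user's own history grows, it pulls the estimate away from the prior.
some_history = bayesian_average(5, 10, prior_rate=0.1, weight_c=10)
```

A larger C keeps the estimate close to what other users do for longer; a smaller C lets the user's own history dominate sooner.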
App Cold Start Experiments
• For each app install, we process the original dataset by splitting it into two subsets:
– one subset containing all the events from the period before the app install, and
– one subset containing the remaining events.
• Newly installed apps in the dataset: 1.42%
• Short-term apps
– Not using P(aui) estimation: 86.3%; using P(aui) estimation: 87.1%
– New apps precision: 91.3%; old apps precision: 86.25%.
• Long-term apps
– The general method attains 89.3% precision, but no newly installed apps are correctly predicted.
– Using our method, precision increases to 90.3%. New apps precision: 91.1%; old apps precision: 89.26%.
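The per-install split described above can be sketched as below. The `(timestamp, app)` event format and the function name are hypothetical; the slides do not specify how events are stored.

```python
def split_at_install(events, install_time):
    """Split events into those before an app install and the remaining ones.

    events: list of (timestamp, app_name) pairs (assumed format).
    install_time: timestamp of the app install.
    """
    before = [e for e in events if e[0] < install_time]
    remaining = [e for e in events if e[0] >= install_time]
    return before, remaining


events = [(1, "mail"), (2, "maps"), (5, "newapp"), (6, "mail")]
before, remaining = split_at_install(events, install_time=5)
```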
User Cold Start
• Most similar user: naïvely select the user with the most similar app inventory
– Pros: easy to implement, high coverage
– Cons: the most similar user might be, in fact, very different from the user considered.
• Pseudo user: minimum set cover
– Objective: minimize the sum of inverse similarity
– Pros: the pseudo user is designed to be very similar to the user considered.
– Cons: NP-Hard (log n approximation exists)
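One way to read the pseudo-user construction is as weighted set cover: cover the new user's app inventory with other users' inventories, where choosing a user costs the inverse of their similarity. The sketch below uses the standard greedy heuristic for weighted set cover, which gives a logarithmic approximation; whether the authors used this heuristic, and the use of Jaccard similarity as the similarity measure here, are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def greedy_pseudo_user(target_apps, inventories):
    """Greedy weighted set cover over the target user's apps.

    target_apps: set of apps installed by the new user.
    inventories: {user_id: set of apps} for existing users.
    Returns (chosen user ids, apps left uncovered).
    """
    uncovered = set(target_apps)
    chosen = []
    while uncovered:
        best_user, best_ratio = None, float("inf")
        for user, apps in inventories.items():
            gain = len(uncovered & apps)
            if user in chosen or gain == 0:
                continue
            cost = 1.0 / jaccard(target_apps, apps)  # inverse similarity
            ratio = cost / gain                       # cost per newly covered app
            if ratio < best_ratio:
                best_user, best_ratio = user, ratio
        if best_user is None:
            break  # remaining apps are not in any inventory
        chosen.append(best_user)
        uncovered -= inventories[best_user]
    return chosen, uncovered


target = {"mail", "maps", "skype"}
inventories = {"u1": {"mail", "maps"}, "u2": {"skype", "game"}, "u3": {"mail"}}
chosen, uncovered = greedy_pseudo_user(target, inventories)
```

In this toy example u1 is picked first because it covers two apps at low cost, and u2 is then needed to cover "skype".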
User Cold Start Experiments
• Most Similar User
– Average precision of 32.7%
– “Scarcity” of similar users: Jaccard similarity between two different app inventories has an average value of 0.121465 (±0.038955) and a median of 0.117647
– Even when similarity is high, the accuracy increase is not satisfactory.
• Pseudo User
– Average precision is 45.7%
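The most-similar-user baseline and the Jaccard similarity it relies on can be sketched as follows. The inventories and user ids are made up for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_user(target_apps, inventories):
    """Return (user_id, similarity) for the inventory closest to target_apps."""
    best = max(inventories, key=lambda user: jaccard(target_apps, inventories[user]))
    return best, jaccard(target_apps, inventories[best])


target = {"mail", "maps", "skype"}
inventories = {"u1": {"mail", "maps"}, "u2": {"skype", "game"}, "u3": {"mail"}}
best_user, similarity = most_similar_user(target, inventories)
```

Here u1 shares two of the three target apps and has nothing else installed, so its similarity is two thirds, well above the average of about 0.12 reported for the real dataset.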
Conclusions
• Presented a (scalable) personalized app prediction methodology achieving up to 90.2% precision.
• Dealt with two cold-start problems:
– On app cold start we achieve precision of up to 87.1% (short-term) and 90.3% (long-term).
– On user cold start we achieve up to 45.7% precision on the first day the user is using the homescreen launcher.
Open Problems
• The biggest open problem is to improve the effectiveness of prediction at cold start, in particular user cold start.
• Is it possible to extend the technique used here to top-k personalized app recommendation?