IJIRST –International Journal for Innovative Research in Science & Technology| Volume 2 | Issue 02 | July 2015 ISSN (online): 2349-6010
All rights reserved by www.ijirst.org 202
A New Dimensional Knowledge Discovery for
Routing Network using Latent Feature Model
Mr. Emil George
Department of Computer Science and Engineering
Nehru College of Engineering and Research Center, Pampady, Thrissur, Kerala, India
Ms. Silja Varghese
Department of Computer Science and Engineering
Nehru College of Engineering and Research Center, Pampady, Thrissur, Kerala, India
Abstract
Cyber Physical Systems are primarily designed to interact with other systems through the data they collect. Information collection depends on the field of knowledge discovery, which draws on either field-based or collaborative filtering. A domain recommendation model filters data based on the rating patterns across domains. The basic function of the cross domain recommendation model is to seek and retrieve information while reducing repeated records. The model identifies the common rating pattern shared across domains and determines the domain-specific rating patterns in each domain, which capture the similar behavior and relations that exist among the fields. For clustering, the system uses different data sets together with an additional amount of accumulated data. These data sets are compared, filtered by item, and clustered by domain. The clustering process reduces incomplete data and yields more meaningful data. The domain recommendation model mainly integrates knowledge discovery and information transfer. Most existing systems retrieve items through a single field, whereas cross domain recommendation provides data by integrating multiple areas with similar details.
Keywords: Cyber Physical System, Cross Domain Recommendation, Nonnegative Matrix Factorization, Knowledge
Discovery and Information Retrieval
_______________________________________________________________________________________________________
I. INTRODUCTION
Nowadays, information handling and sharing are most abundant in the field of e-commerce. This abundance makes it hard to get the right information from suitable sources. Recommender systems offer a useful way to cope with this information overload, and they are an active research field for gathering the most suitable information. Most of these systems operate within a single field. However, user knowledge acquired in one area can be clustered and reused in other domains with similar settings. The cross domain recommendation model is not limited to information sharing; it also provides distributed recommendations from related user ratings obtained in different areas, e.g., recommending not just a movie [2], but also music CDs and books related to that movie. Existing work also offers joint recommendations across multiple areas, but it chooses recommendations only on the basis of user preferences, and the sharing is confined to a particular area of interest. The critical tasks in these studies are to discover the correlations between users' preferences in different areas, to relate item factors to user interests, to design suitable models that integrate these findings, and to develop a structure that encompasses all of these evaluations. Cross domain recommendation tends to yield more useful and more precise recommendations than single domain recommendation.
Finally, there are schemes that concentrate on combining recommendations for different areas to establish a single knowledge base with standardized features. Whenever a user visits a web site, the system records the user's feedback and uses it as ratings. Based on such ratings, the recommender system predicts personalized information for the items of interest. This is a suitable way to partition the information and sort it with appropriate approaches. In some situations, however, users are unwilling to supply useful feedback because of a lack of involvement. For instance, a user who has just bought an electronic device may have no plan to buy another one soon. This can cause data sparsity problems. Inefficient or unreliable responses from users are a major limitation for single domain models. To solve these problems, the cross domain recommendation model uses multiple domains with similar referencing features. Because information is shared across multiple fields, the model collects a vast quantity of data. Most models assume that the collected information is dense for all users and items [4]. An augmented matrix can be created by horizontally merging all clustered matrices with similar feedback. Existing studies of social networks focus on the relationship between two different areas. These interrelations have different impacts on society and create new combinations of related fields. For example, bioinformatics was formed by combining biology with computer science. Medical information processing uses the aggregated results of data mining in medicine, which benefits medical science. Cross domain recommendation can be useful in any field for combining similar things from different fields. These combined solutions are more effective because they bring together useful information from both fields. The cross
domain recommendation model works mainly by knowledge prediction. Studying collaboration and arriving at a result therefore becomes hard. The main challenge is to discover the patterns across the domains. Cross domain clustering is usually not applied in the traditional approach within an identical domain [3], partly because it is more difficult to find suitable relationships in a field with which one is not familiar. This makes the approach challenging to apply directly without proper training samples and information. Compared with general data mining, clustering helps researchers identify the region they need to analyze and find identical feedback. The cross domain setting requires a higher degree of prediction.
II. RELATED WORKS
In medical analysis, an analyst who wants to apply data prediction techniques to detect diseases will find it difficult to predict suitable relationships with data mining. Cross domain work may bring difficulties in finding suitable combinations of terminology from two dissimilar areas. Most domain cluster models are founded on a single field. In this setting, the base matrix is prepared with user-item data from the auxiliary matrix. This is the same model that is used for data collection in self-taught learning. The self-taught learning model collects an abundant quantity of data from the auxiliary matrix and represents the test data as a linear combination of bases. Cross domain recommendation represents two-sided data as a matrix whose rows and columns correspond to users and items. Instead of the array used in the self-taught learning model, the cross domain model uses the matrix form. The code book model is founded on model-based collaborative filtering and the flexible mixture model. The model-based collaborative filtering model is likewise founded on two-sided clustering and is defined as
$$P(r, i, j) = P(i)\,P(j)\,P(r \mid u_i, v_j)\,\phi(u_i, v_j),$$
where $u_i$ denotes the user cluster to which the $i$th user belongs, $v_j$ the item cluster to which the $j$th item belongs, and $\phi$ the cluster association parameter. The flexible mixture model is the probabilistic model defined as
$$P(r, i, j) = \sum_{u, v} P(i \mid u)\,P(j \mid v)\,P(r \mid u, v)\,P(u)\,P(v),$$
where $u$ denotes the user cluster, $v$ the item cluster, and $r$ the rating. The flexible mixture model is a fully probabilistic model, whereas the cross domain model is based on clustering items with users.
Work on collaborative filtering has addressed sparsity reduction and similarity enhancement [3], focusing on knowledge within an identical domain via user-item clustering, association rules, and item-based reasoning [5]; on smoothing the auxiliary matrix with the average rating values within each user cluster [6]; and on rating patterns obtained by simultaneously clustering the items and users. After clustering, the user-based and item-based search results are combined. These models are used to find the required domain. Nevertheless, they are based entirely on a single corresponding domain. P. B. Li [3] described a system that attempts to combine social networks for collaborative filtering. This system combined content-based filtering and collaborative filtering. Different ranking systems learn heterogeneous content from social networks by combining machine learning methods with regression. Yuan et al. [8] aimed to fuse heterogeneous social relationships for recommendation using factorization and regularization techniques. Other algorithms recommend by combining the similarities of users in online social communities, using collaborative topics and filtering with probabilistic domain modeling. Yet most of these works apply collaborative filtering to a single domain only and do not address the sparsity problem. Such models provide data only for specific areas on their respective subjects. Dual transfer learning exploits the duality through matrix tri-factorization, which is proposed to solve the problems that arise in clustering and sorting information.
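As an illustration of the flexible mixture model formula given earlier in this section, the following minimal sketch evaluates P(r, i, j) by summing over all user and item clusters. The probability tables and their sizes are hypothetical toy values chosen only so that each conditional distribution sums to one; they are not taken from the paper.

```python
from itertools import product

def fmm_joint(r, i, j, p_u, p_v, p_i_given_u, p_j_given_v, p_r_given_uv):
    """P(r, i, j) = sum over (u, v) of P(i|u) P(j|v) P(r|u,v) P(u) P(v)."""
    total = 0.0
    for u, v in product(range(len(p_u)), range(len(p_v))):
        total += (p_i_given_u[u][i] * p_j_given_v[v][j]
                  * p_r_given_uv[u][v][r] * p_u[u] * p_v[v])
    return total

# Hypothetical toy parameters: 2 user clusters, 2 item clusters,
# 2 users, 2 items, and 2 rating levels (indexed 0 and 1).
p_u = [0.5, 0.5]                          # P(u)
p_v = [0.4, 0.6]                          # P(v)
p_i_given_u = [[0.7, 0.3], [0.2, 0.8]]    # P(i|u), indexed [u][i]
p_j_given_v = [[0.6, 0.4], [0.1, 0.9]]    # P(j|v), indexed [v][j]
p_r_given_uv = [[[0.9, 0.1], [0.5, 0.5]],
                [[0.3, 0.7], [0.2, 0.8]]]  # P(r|u,v), indexed [u][v][r]

params = (p_u, p_v, p_i_given_u, p_j_given_v, p_r_given_uv)

# Sanity check: a joint distribution must sum to 1 over all (r, i, j).
total_mass = sum(fmm_joint(r, i, j, *params)
                 for r in range(2) for i in range(2) for j in range(2))
```

Because every conditional table is normalized, the total mass over all rating, user, and item combinations equals one, which is a quick way to check that the tables are wired into the formula correctly.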
III. PROBLEM DEFINITION
The existing model generates the latent feature structure by sharing common ratings across the domains, collecting the items contained in the user-item clusters. However, the rating patterns from multiple domains rarely contain similar ratings. At the same time, a limited number of domains are more closely related to the target domain of interest than the others. The relatedness among multiple domains cannot be fully captured by identical rating patterns, because these do not hold exactly for all domains. The domain cluster model is designed around the item cluster factor and the user cluster factor derived from the available rating data. Based on subspace identification, the model learns the user-item rating pattern at the common cluster level that is shared across the domains. It can simultaneously capture domain-specific rating patterns by learning from the different domains. Consider book-rating and movie-rating web portals: books and movies can have similar topics or clusters (e.g., the categories classical or comedy), but these similarities do not hold in all situations. For example, knowledge about award-winning movies cannot help to cluster books on the topic of award history. Inspired by this situation, the recommendation model uses knowledge-based assumptions to establish relations across the domains. These predictions relate the user ratings to item characteristics and to similarities between the domains. Data collected from multiple data sources have heterogeneous data types, and each type is categorized as a “domain”. This problem setting is based on the assumption that some data types in the auxiliary domains can be obtained more easily than the data type in the target domain. Normally, this
problem setting requires the user/item sets in different data domains to be the same. Then the knowledge can be discovered and
transferred by finding relationships between data domains.
IV. PROPOSED TECHNIQUE
The domain clustering model consists of two domains: one is used to process the data, and the other is modeled with the co-clustered items. A latent feature model is used for clustering items as well as users.
1) Definition 1: Nonparametric latent feature models. The idea behind a latent feature model is to decompose the data into small components based on the domain and to cluster each item or user based on similarities.
Table - 1
User and item profiles in the book and movie domains
Attribute | Value (Book Domain) | Rating | Attribute | Value (Movie Domain) | Rating
Toy Story | Cartoon | 4.2 | White Balloon | Cartoon | 1.0
Golden Eye | Fantasy Novel | 3.5 | Apollo | Adventure | 0.8
Get Shorty | Adventure | 5.0 | Crumb | Carrie Fisher | 1.0
When clustering the users and items, the algorithm works with the routing information as well as the timing information available in the user data. It therefore provides more accurate results when co-clustering items and users with respect to time. Table 1 compares two different domains that have similar characteristics and behavior. Each item is classified by the ratings assigned to it, and items are related to users through the user identifiers. The attributes of the book domain are clustered by item type (e.g., cartoon), and rating-based similarities are computed between items or users. Consider the example shown in Table 1, which compares attributes from the book and movie domains that share the category “Cartoon”. The similarities are then computed between the ratings given to the same items by the same users, or between the ratings of similar items given by similar target users.
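The rating-based similarity between two items can be sketched as follows. The paper does not name a similarity measure, so cosine similarity over the users who rated both items is an assumption here, and the user identifiers and per-user ratings are hypothetical (Table 1 lists only one rating per item).

```python
import math

def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two items over the users who rated both.

    Each argument maps a user identifier to that user's rating of the item.
    Returns 0.0 when the items share no raters.
    """
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[u] * ratings_b[u] for u in common)
    norm_a = math.sqrt(sum(ratings_a[u] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings_b[u] ** 2 for u in common))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical per-user ratings for two "Cartoon" items from different domains.
toy_story = {"u1": 4.0, "u2": 5.0, "u3": 3.0}
white_balloon = {"u1": 2.0, "u2": 2.5, "u4": 1.0}

# Only u1 and u2 rated both items; their ratings are proportional (4:5 and 2:2.5).
similarity = cosine_similarity(toy_story, white_balloon)
```

Restricting the computation to co-rating users keeps unknown ratings from being treated as zeros, which matters when the rating matrices are sparse.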
Fig. 1: User-item values in two rating matrices. Unknown data is represented by “?”. The parameters u1 and u2 represent the user rating values, and v1 and v2 represent the corresponding ratings for items.
In the latent feature model, the information with the most similarities within each domain is considered when collecting data. For example, information on “Data Mining” is grouped with its most similar keywords, such as “clustering”, “sorting”, and “classification”.
A. Latent Feature Relational Model
The latent feature relational model is a method proposed for filtering user/item ratings. It supports similarity models for clustering a range of data sets, and multiple latent features can be active in an observed data set. Dimensionality reduction methods such as factor analysis and probabilistic matrix factorization provide a statistical approach to inferring the latent features. These methods yield a small set of dimensions and model each data point as a weighted combination of them. Dimensionality reduction can improve predictions on the observed data. When applied to user/item rating patterns, the model allows dependencies between data points. The dependencies between observations may be spatial or temporal, and they arise from the sharing of latent features between nearby data points. The main goal of the latent feature model is to improve the performance of generative models by properly representing the latent structure of individual entities in the observed data.
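The idea of modeling each rating as a weighted combination of a few latent dimensions can be sketched with a small regularized matrix factorization trained by stochastic gradient descent. This is a generic illustration rather than the paper's model: the function names, the hyperparameters (`k`, `steps`, `lr`, `reg`), and the toy ratings are all assumptions.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item latent vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

def factorize(ratings, n_users, n_items, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit user factors P and item factors Q to (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Hypothetical observed ratings: (user index, item index, rating).
toy_ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
               (2, 0, 1), (2, 1, 1), (2, 2, 5), (3, 1, 1), (3, 2, 4)]

P, Q = factorize(toy_ratings, n_users=4, n_items=3)
train_mae = sum(abs(r - predict(P, Q, u, i)) for u, i, r in toy_ratings) / len(toy_ratings)
```

Once the factors are fitted, `predict` can also be called for user-item pairs that were never observed, which is how a latent representation fills in unknown ratings.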
B. Model Description
The data set is formed as a matrix Z with an infinite number of columns. The rows of Z correspond to the users and the columns correspond to the items. In the analysis of the data, the cells represent the ratings given by the users to the items. The data sets are filled by selecting the matching rating patterns of similar domains. The rating distribution
matrix consists of a finite number of rows and columns. If the observed ratings from the two matrices are more similar, the
resulting matrix will be more likely to share the same ratings.
V. DATA SETS
The following real-world data sets are used for the experiments.
1) MovieLens data set: This data set contains more than 100000 user opinions with ratings from 1 to 5, given by 2875 users on 11542 movies. A matrix is created by grouping the data and randomly choosing 300 users and 300 movies from those with the most ratings; unknown entries are filled with average ratings.
2) Book-Crossing data set: This book rating data set contains 21500 user ratings on a scale from 1 to 7, given by 560 users on 25000 books. The ratings are limited to the range 1 to 5 for a suitable comparison with the MovieLens data set. A sub-matrix of 200 users and 500 books is selected randomly from those with the most ratings.
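The preprocessing described for both data sets (keeping the most-rated users and items, then filling unknown entries with average ratings) can be sketched as below. The paper does not say which average is used, so the per-user mean is an assumption; the random sampling step is omitted for brevity, and the helper names and the toy matrix are hypothetical.

```python
def densest_submatrix(matrix, n_users, n_items):
    """Keep the n_users rows and n_items columns with the most known ratings.

    Unknown ratings are represented by None.
    """
    row_counts = [sum(v is not None for v in row) for row in matrix]
    rows = sorted(range(len(matrix)), key=lambda r: -row_counts[r])[:n_users]
    n_cols = len(matrix[0])
    col_counts = [sum(matrix[r][c] is not None for r in rows) for c in range(n_cols)]
    cols = sorted(range(n_cols), key=lambda c: -col_counts[c])[:n_items]
    return [[matrix[r][c] for c in sorted(cols)] for r in sorted(rows)]

def fill_with_user_mean(matrix):
    """Replace each unknown rating with the mean of that user's known ratings."""
    filled = []
    for row in matrix:
        known = [v for v in row if v is not None]
        mean = sum(known) / len(known) if known else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled

# Hypothetical 3-user by 3-item rating matrix with unknown entries.
toy = [[5, None, 3],
       [None, None, 1],
       [4, 2, None]]

sub = densest_submatrix(toy, n_users=2, n_items=2)
dense = fill_with_user_mean(sub)
```

In this toy case the second user is dropped for having only one rating, and the remaining unknown entry is replaced by that user's mean.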
VI. EXPERIMENT RESULT AND DISCUSSION
A. Cross Domain User-Item Clustering
A useful way to detect the issues is to define a cross domain graph connecting the user's source and auxiliary domains, and to use a clustering algorithm to rank the relations in the auxiliary domain. User-item similarities provide a better choice for multi domain connections, through the subset of user-item pairs shared between the domains. In this method, not all collections are considered. One challenge is how to relate the source domain to the auxiliary domain; a clustering algorithm is used for this purpose.
B. Rating Pattern Sharing Across Domains
Rating-pattern sharing was proposed in Code Book Transfer (CBT) for solving adaptive transfer learning (domain adaptation) problems in collaborative filtering (CF). The idea was then incorporated into a probabilistic model, the Rating-Matrix Generative Model (RMGM), for solving collective transfer learning (multi-task learning) problems in CF. Both CBT and RMGM are cross domain CF methods over system domains. Rating patterns are transferred across the related matrices. User-item ratings are matrix data with two finite sets of objects (users and items). These objects are co-clustered simultaneously, and similarity among different rating matrices is found by matching group-level rating patterns. We can thus simultaneously group users based on their ratings on items, and group items based on the ratings provided by users, in both domains to find shared group-level rating patterns.
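A group-level rating pattern of the kind described above can be sketched as a small code book whose entries are the average observed ratings within each user-cluster and item-cluster pair. The cluster assignments are taken as given inputs here (in practice they would come from a co-clustering step), and the function name and toy data are hypothetical.

```python
def build_codebook(matrix, user_cluster, item_cluster, n_user_clusters, n_item_clusters):
    """Average the observed ratings inside every (user cluster, item cluster) block.

    matrix[u][i] is a rating or None when unknown; user_cluster[u] and
    item_cluster[i] give the cluster index of each user and item.
    """
    sums = [[0.0] * n_item_clusters for _ in range(n_user_clusters)]
    counts = [[0] * n_item_clusters for _ in range(n_user_clusters)]
    for u, row in enumerate(matrix):
        for i, rating in enumerate(row):
            if rating is not None:
                k, l = user_cluster[u], item_cluster[i]
                sums[k][l] += rating
                counts[k][l] += 1
    return [[sums[k][l] / counts[k][l] if counts[k][l] else None
             for l in range(n_item_clusters)]
            for k in range(n_user_clusters)]

# Hypothetical rating matrix: users 0-1 prefer items 0-1, users 2-3 prefer items 2-3.
toy_matrix = [[5, 4, 1, None],
              [4, None, 1, 2],
              [1, 2, 5, 4],
              [None, 1, 4, 5]]
user_cluster = [0, 0, 1, 1]
item_cluster = [0, 0, 1, 1]

codebook = build_codebook(toy_matrix, user_cluster, item_cluster, 2, 2)
```

The resulting 2 by 2 code book is much smaller than the rating matrix, which is what makes it a convenient unit of knowledge to match or transfer between domains.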
C. Domain Clustering for Collaborative Filtering
In this work, the user-item matrix X is clustered using the orthogonal nonnegative matrix tri-factorization algorithm, which produces a new user-item co-clustering. Traditional methods require comparing against the whole database; in contrast, the clustering method can filter the data based on the clustering results, which reduces the search space.
Table - 2
Mean Absolute Error (MAE) of the compared models on the EachMovie and Book-Crossing related domains
Algorithm | Given 5 | Given 10 | Given 20
NMF (Nonnegative Matrix Factorization) | 0.980 | 0.942 | 0.879
CBT (Code Book Transfer) | 0.954 | 0.922 | 0.839
RMGM (Rating Matrix Generative Model) | 0.944 | 0.933 | 0.915
CBCF (Cluster Based Collaborative Filtering) | 0.924 | 0.896 | 0.890
CLFM (Cluster-Level Latent Feature Model) | 0.933 | 0.909 | 0.88
Table 2 shows the MAE of the compared models on the EachMovie and Book-Crossing related domains under different configurations.
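The orthogonal nonnegative tri-factorization mentioned at the start of this subsection approximates X by F S G^T with nonnegative factors. The sketch below uses multiplicative update rules that are commonly used for this factorization; whether the paper uses exactly these updates is an assumption, and the function name, iteration count, and toy matrix are hypothetical.

```python
import numpy as np

def onmtf(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (users x items) as F @ S @ G.T with nonnegative factors.

    F (users x k) holds user-cluster memberships, G (items x l) holds
    item-cluster memberships, and S (k x l) holds cluster-level ratings.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((m, l)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        XtFS = X.T @ F @ S
        G *= np.sqrt(XtFS / (G @ (G.T @ XtFS) + eps))
        XGSt = X @ G @ S.T
        F *= np.sqrt(XGSt / (F @ (F.T @ XGSt) + eps))
        S *= np.sqrt((F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps))
    return F, S, G

# Hypothetical block-structured rating matrix (4 users x 4 items).
X = np.array([[5.0, 5.0, 1.0, 1.0],
              [5.0, 5.0, 1.0, 1.0],
              [1.0, 1.0, 5.0, 5.0],
              [1.0, 1.0, 5.0, 5.0]])

F, S, G = onmtf(X, k=2, l=2)
user_labels = F.argmax(axis=1)   # cluster index per user
item_labels = G.argmax(axis=1)   # cluster index per item
```

Taking the largest entry in each row of F and G gives a hard co-clustering of users and items, so later comparisons can be restricted to a cluster instead of the whole database.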
The evaluation uses different configurations. For each user, three different numbers of observed ratings (Given 5, Given 10, and Given 20) are provided for comparing the algorithms. Fig. 3 shows the comparison of the different models. The MAE obtained with the RMGM model indicates better performance than CBT and NMF (in Table 2, this holds for the Given 5 configuration: 0.944 against 0.954 and 0.980).
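The evaluation protocol can be sketched as follows: for each test user, N ratings are revealed to the model and the remaining ratings are held out, and MAE is the average absolute difference between held-out and predicted ratings. The splitting helper, its seed parameter, and the toy values are assumptions made for illustration.

```python
import random

def mean_absolute_error(actual, predicted):
    """MAE = average of |actual - predicted| over all test ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def split_given_n(user_ratings, n, seed=0):
    """Split one user's {item: rating} dict into n observed ratings and the rest."""
    rng = random.Random(seed)
    items = sorted(user_ratings)
    observed_items = set(rng.sample(items, n))
    observed = {i: user_ratings[i] for i in items if i in observed_items}
    held_out = {i: user_ratings[i] for i in items if i not in observed_items}
    return observed, held_out

# Hypothetical ratings of one test user and a "Given 5" split.
user_ratings = {f"item{idx}": float(1 + idx % 5) for idx in range(8)}
observed, held_out = split_given_n(user_ratings, n=5)

# Hypothetical held-out ratings and model predictions.
mae = mean_absolute_error([4.0, 3.0, 5.0], [3.5, 3.0, 4.0])
```

A lower MAE means the predictions are closer to the held-out ratings, so smaller values in Table 2 correspond to better models.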
Fig. 2: MAE performance of the compared models with respect to the value of shared subspace dimensionality in EachMovie domain.
The evaluation shows that CLFM is more effective than the other models at clustering the user ratings, and it reduces sparsity by sharing useful information across the systems. In the graph, a smaller value indicates better performance.
VII. CONCLUSION
This paper focuses on cluster-level latent features across multiple domains. The clusters and domain factors are obtained via orthogonal nonnegative matrix tri-factorization using the cluster-level latent feature model, which can transfer useful knowledge from the auxiliary domain to the target domain. The knowledge is transferred in the form of a code book. The experiments evaluated rating and ranking prediction performance in terms of various metrics, using different models and other comparative methods. The model is based entirely on clustering user/item ratings with cluster-level features. For future work, we plan to study systematically the co-clustering scenario of multiple domains and to form new clustering methods with differentiating agents.
REFERENCES
[1] Sheng Gao, Zhanyu Ma, and Patrick Gallinari, “A Cross Domain Recommendation model for Cyber Physical System,” IEEE Transactions on Emerging Topics in Computing, vol. 1, no. 2, January 2014.
[2] P. B. Li, “Cross-domain collaborative filtering: A brief survey,” in Proc. 23rd IEEE ICTAI, Nov. 2011, pp. 1085–1086.
[3] Jie Tang, Sen Wu, Jimeng Sun, and Hang Su, “Cross-domain Collaboration Recommendation,” KDD'12, August 12–16, 2012.
[4] Gang Chen and Fei Wang, “Collaborative Filtering using Orthogonal Nonnegative Matrix Tri-factorization,” Seventh IEEE International Conference on Data Mining.
[5] F. Abel, E. Herder, G. J. Houben, N. Henze, and D. Krause, “Cross-system user modelling and personalization on the social web,” User Modelling and User-Adapted Interaction, pp. 1–42, 2011.
[6] Siting Ren and Sheng Gao, “Improving Cross-domain recommendation through Probabilistic Cluster Level Latent Factor Model,” cs.IR, 24 Sept. 2014.
[7] Bin Li, Xingquan Zhu, Ruijiang Li, and Chengqi Zhang, “Rating Knowledge sharing in Cross-domain Collaborative filtering,” IEEE Transactions on Cybernetics, 2014.
[8] Q. Yuan, L. Chen, and S. Zhao, “Factorization vs. regularization: fusing heterogeneous social relationships in top-N recommendation,” in RecSys'11, pp. 245–252, 2011.