Informatica 34 (2010) 409-417 409
Context-based Global Expertise in Recommendation Systems
Vincenza Carchiolo, Alessandro Longheu, Michele Malgeri and Giuseppe Mangioni
Dip. Ingegneria Informatica e delle Telecomunicazioni
Facoltà di Ingegneria - Università degli Studi di Catania - Catania - Italy
Keywords: expertise, recommendation systems, collaborative filtering, social networks Received: December 8, 2009
Expertise assessment is frequently required in social networks. This work proposes a method to globally rankpeople according to their expertise according to a set of topics We also introduce contexts similarity to allow related contexts to be exploited in expertise assessment. These ideas are applied to the Epinions.com recommender systems, showing that expertise in recommendation matters.
Povzetek: Opisana je metoda za rangiranje posameznikov glede na socialno omrežje.
1 Introduction and related work
The concept of expertise can be simply defined as the skill or knowledge that a person has in a particular field; more formally, expertise is "the ability to discriminate meaningful classes of domain features and patterns, and to take decisions or actions that are appropriate to the class at hand" [9].
Apart from definitions, less trivial matters are how the expertise can be evaluated, and which effective applications it can have. The expertise assessment can be hard, expe-cially in nowadays virtual social networks (e.g. Facebook, or even e-commerce oriented, as eBay), due to the the lack of real person to person interactions used in real world to judge someone's expertise level. Several approaches to this issue have been proposed [6, 12, 31, 32].
In particular, in [6] the authors aim at ranking the expert candidates in a given topic based on a data collection, hence they locate three components, i.e. a supporting document collection, a list of expert candidates, and a set of expertise topics. This work (as others) shows that an expertise rank is strictly related to a topic, so the question of which set of topics as well as their relationship (for instance, an arrangement into an ontology) should be addressed. That paper also highlights that expertise is commonly inferred from a set of documents (personal profiles, web pages, forum messages etc.) that represent the use case where expertise is applied, and they are the virtual counterpart of real world interactions between persons usually used to assess the each other's expertise. The evaluation of an expertise rank by exploiting some data is not a new idea[13], and it has been successfully applied in other scenarios, e.g. Usenet news messages [26] or computer supported cooperative work (CSCW) [16].
In [12], topic expertise is ascertained by exploiting collaborative tagging mechanisms that enable the formation of social networks around tags or topics. Authors state that inferring expertise from data as personal profiles is problematic since users should keep them updated, and they also
debate about the granularity in skill levels that should not be either too coarse or too fine (in the former case, automated systems have a difficult time selecting the right people, whereas in the latter users can hardly determine their levels in relation to others).
The work presented in [31] is a propagation-based approach for expertise assessment that takes into account both person local information and relationships between persons; this raises the question of local vs global approaches that is frequent whenever complex networks are considered, hence not only in expert finding scenarios but also others (e.g. trust, recommendation systems etc.)
In [32] the question of expertise within online communities is addressed, and network structure as well as algorithms are tailored to the case of Java Forum; this suggests that a fundamental role is played by the specific (possibily complex) network being considered and by its properties [18, 2].
In this work we present a method to rank people according to their expertise in a set of topics. We perform this assessment in an expertise network, i.e. where the relationship among nodes is the expertise rank assigned in a given context. In particular, we aim at evaluating the global expertise a node v has in a network within a specific context based on local expertise ranks assigned by v's neighbor nodes. The placement of our work in comparison with issues highlighted in works cited so far can be schematized as follows:
-	we infer expertise from data, in particular we exploit the Epinions [25] dataset, where people provide reviews about products and rating about reviews; these ratings are used (see section 3) to infer expertise about people that provided reviews
-	expertise is associated to a context (products categories), thus an user can be assigned with several expertise ratings, one for each context; moreover, products categories are arranged into a hierarchy (from general to specific category), hence we leverage this
410 Informática 34 (2010) 409-417
V. Carchiolo et al.
ontology to manage the granularity of expertise skill levels, the coarser granularity is needed, the more general categories are considered
- since users provide reviews for products in a specific category, we do not need to infer the association between an user and topic expertise (categories), hence we just focus on the expertise ratings assessment
- we aim at taking into account both global and local information, indeed we want to predict the review a specific user is likely to assign to an unknown product based on his neighbours' review about the same item (local information) weighting such reviews with global expertise ranks, as illustrated in sec. 4
The last point reveals the scenario where we intend to apply expertise, i.e. recommendation systems. The concept of expertise indeed is useful in several real applications, e.g. trust and reputation management [11], the assignment of task in an enterprise, or paper reviewers in a conference [12].
Recommender systems are "a specific type of information filtering technique that attempts to present information items that are likely of interest to the user". Such systems gained more and more attention since 1990s, when collaborative filtering approaches were developed [24], since the increasing amount of products available, the markets expansions due to e-commerce, and the diversity of customers also supported and enhanced by social networks, all endorse the need for effective recommender systems [1], a goal that not only relies on algorithms, but also imposes to consider several factors [15]. Recommender systems can adopt the content-based or collaborative filtering [3] approach; in the former case the system recommends an user about items similar to those he chose in the past, whereas in the latter the system suggests items chosen by similar users. The concept of similar users is often based on user profiles, however others ([19]) argue that trustworthiness among users might be also considered. We adopt the expertise as a discriminant factor when considering users' opinions (reviews) to get an effective recommendation; this approach is considered in several works, e.g. [12, 27]. Moreover, we also introduce the contexts similarity to allow related contexts to be exploited during ranks assessment, in order to provide an effective recommendation even when the experts' context is not exactly the same the recommended item falls within.
In the rest of paper, section 2 introduces the formalization of our approach, while in section 3 we apply the proposed model to the expertise network built from Epin-ions.com data set, defining and exploiting contexts similarity, showing results and application to recommender systems in section 4. Section 5 presents our conclusion and future works.
2 Expertise evaluation model
The expertise network we refer to in the following is modeled as G(V, L, lab), i.e. a labeled multi-digraph where the set of nodes V represent users1, L is a set of oriented edges, i.e. an arc (v, w) means that v assigned w at least an expertise rank in a context2, and the labeling function lab : L ^ {(C x [0,1])} associates to each arc (v, w) the set of pairs {(ci,ri)} being ri e [0,1] the expertise rank v assigned to w within context ci e C (C is the set of all contexts). Note that for a given arc, the rank is at most one for each context. In the following we indicate l% w as the rank r associated to the arc (v, w) within context c, assuming that lcvw = 0 for any context c when arc (v, w) does not exist, and Lc = [l% w] as the weighted adjacency matrix. A transition matrix Pc is then defined starting from Lc, as detailed later. Each element pVw of Pc represents a normalized expertise rank v assigned to w in context c.
To illustrate the mechanism we adopt to assign a global expertise context-specific rank we initially focus on two generic users v and w, evaluating how v can assign rank to w in a given context c. In the real world, if w is one of v's neighbours, it is reasonable to use pVw, otherwise v can ask to his neighbours whether they know w to get an opinion about him in c. In this case, if each of v's neighbours (denoted as j) directly knows w he can provide pcw, and it is reasonable that v weights these values with ranks he assigned to his neighbours within the same context, thus
having rlw = E j pVj • Pjw.
This one-step neighbours ranking can be written into matrix form as (rV= (Pc)T • pV, where rV and pcv are the vectors built from rcvw and pcVj respectively.
If neighbours j do not directly know the target w, v can further extend its requests to two, three, ..., k-steps neighbours, hence at step (k + 1) the ranking assessment is expressed by eq. (1). If Pc is neither reducible nor periodic [30], rV will converge to the same vector for every v, specifically to Pc eigenvector associated with the principal eigenvalue A1 = 1, leading to a global expertise rank for w in c.
" ' ' (1)
(rV)(k+i) = ((PC) r • (rV)
This approach is similar to the one proposed by EigenTrust [23] (where trust is replaced by expertise rank), and is frequently adopted also in other well-known works [20], [14]. They also offer a probabilistic interpretation of their method derived from the random walker graph model [17], a widely accepted mathematical formalization of a trajectory. We interpret the random walk as follow: if an agent v is searching for an expert within a given context c, he can move along the network choosing the further node w with probability pcVw e Pc; crawling with this method until a stable state is achieved, the agent is more likely to be at an expert node than at an unqualified node. From the random walker point of view, the first principal eigenvector
1 In the following, we will use the terms user and node interchangeably.
2in the following, we will use "context" or "topic" indifferently
CONTEXT-BASED GLOBAL EXPERTISE IN.
Informatica 34 (2010) 409-417 411
of Pc correspond to the standing probability distribution of the Markov chain defined by Pc, and network nodes are its states; thus, we define the expertise vector as the stationary point of the transformation given in (1) with non-negative components.
So far, the context c was fixed, but we also want to study contexts influence, i.e. even if the walker is biased by the context c, it can walk towards an user in a context similar to c, thus we have to choose how the walker moves along the network. To this purpose, we introduce two different walking models, described in the following, both taking into account that an user that has a large number of incoming links (within context c) is considered an expert within c, hence a walker that moves along a path connecting experts should enforce this quality.
Strong biased model
Given a topic (or context) c e C, a walker standing in a node v at step k moves to one of his outgoing neighbours w at step k + 1 if and only if l%w > 0 (i.e. one of v's neighbours is somehow expert within c).
According to the above definition, we define the transition matrix Pc as eq. (2), where outdeg(v)c is the number of arcs for which lcvw > 0 (hence J2w Pvw is always 0 or 1).
and w, there are paths from v to w of any length except for a finite set of lengths.
Both strong and smooth biased models do not work with dangling user and disconnected graphs; dangling users are those with no outgoing link that can be present in any real network. Moreover, in the strong biased case, users that have no outgoing links labelled by topic c also became dangling.
Among several solutions for dangling users that have been proposed [20, 5], we choose that a walker in a sink moves to any user according to a given probability distribution. We then define a new transition matrix (Pc) as (Pc) = Pc + S • aT, where a = (1/n,..., 1/n) and S = [S^ where Si = 1 if i is a dangling user and 0 otherwise; this guarantee that J2 w V%w = 1, ^v,w e V .The same trick is used to avoid users without ingoing links (that violates the aperiodic property), so achieving the following formula3:
(Pc) = q ■ (PC + (1 - q) ■ A, where A =(1,..., 1) ■ aT, q £ [0,1].
(4)
Thus from a non dangling user a walker follows one of the local outgoing links with probability q and jumps to some w e V with probability (1 - q); a common value for q is 0.05 [5].
{1 /outdeg(v)c if lVw > 0 0	otherwise
(2)
Smooth biased model
Given a topic c e C, a walker standing in a node v at step k moves to one of his outgoing neighbours w at step k +1 according to a probability distribution depending on similarity (relatedness) between c and all couples (ci,ri) labelling (v,w). The relatedness in smooth biased approach pcvw is defined accordingly to relatedness between topics pairs labelling the link (v, w) e L and c e C (eq. (3)). In particular given a topic c, we define the relatedness function as d : V x V x C ^ [0,1] (denoted as d(v,w,c)\(v,w) e L,c e C). Accordingly, the probability is defined in eq. (3)
pvw
= d(v, w, c)/ d(v,j, c)
(3)
Note that both strong and smooth biased models may lead to a transition matrix Pc where all elements of some rows and/or some columns are 0 (therefore Pc is not irreducible), and sometimes the associated graph might be disconnected.
To find stationary vector the transition matrix is required to be irreducible (equivalent to state that associated digraph is strongly connected [30]) and aperiodic: the first condition implies that exists a directed path from each node to any other, whereas the second implies that for any users v
3 Epinions.com: a case study
Epinions (http://www.epinions.com) is a recommendation system that "helps people make informed buying decisions" [25]. This goal is achieved through unbiased advice, personalized recommendations, and comparative shopping. Epinions allows registered users to rate products writing a review in order to provide visitors with opinions; a review can be represented as a numeric value plus a text comment about the product. Registered users could also rate the reviews, actually providing an expertise rank about other users. We use the Epinions dataset to validate our approach because it is a large and real dataset and although it is mainly a recommendation network, the reviews voting actually implements an author's reviews reputation mechanism based on products categories (i.e. authors are assigned an expertise rank within contexts).
We however still need to investigate the raw dataset about the assessment of (1) expertise and (2) contexts similarity.
To address the former issue, we consider an user w writing a review on a product (belonging to a given category, e.g. electronics), and another user v that can provide a rank to w's review, considering it useful or not; w can provide several reviews on products belonging to different categories, and v can rate all of them. Based on such information, we then build the arc (v,w) and label it with a set of pairs {(ci, ri)}, where we associate each context to exactly one products category, and the expertise rank with the
3in literature the formula is referred as teleportation vector
P
412 Informática 34 (2010) 409-417
V. Carchiolo et al.
rate v provided about w's review for the product belonging to that category; note that in the case w reviewed more products belonging to the same category, we evaluate the normalized average rate provided by v over all these products, so that ri is within the [0,1] range. Of course, we discard all users that did not provide any review.
Another issue is to define a metric to evaluate context similarity, (e.g. TVs category is intuitively much more related to electronics than wellness and beauty); this is needed by the random walker to exploit different yet related contexts. This semantic distance is a function we name sim(ch,ck) g [0,1] where 0 means no similarity and 1 means that contexts ch and ck are identical terms. Measuring the semantic distance between terms (contexts) has been extensively considered in literature (Vector Space Model [22], Resnik [21], Lesk similarity [4]). Since Epinions provides a hierarchical arrangement of contexts, e.g. electronics includes sub-contexts as cameras & accessories, Home audio, we can exploit this to provide a simple semantic distance evaluation. In particular, to find the semantic distance between ci and c2 we search for "the concept c3 which generalizes c1 and c2 with type T3 such that T3 is the most specific type which subsumes T1 and T2; the semantic distance between c1 and c2 is the sum of the distances from c1 to c3 and c2 to c3 ".
This metric is described in [10, 8] and it satisfies reflexiv-ity, symmetry and triangle inequality properties. Moreover topics types are always the same, therefore our metric can be stated as the "sum of the distance between two concepts and first common ancestor" (along the hierarchical classification provided by Epinions). Finally, we normalize the semantic distance between contexts in order to have values into the [0,1] range as shown in eq. (5), where max_dist is the length of the longest path between contexts and a ■ b means that a is an ancestor of b in the Epinion hierarchy.
Dataset extracted from www.epinions.com
sim(ci, Cj) =
min(d(ci, Ck) + d(cj,Ck)) max dist
VCk < Ci, Cj
(5)
d(v, w, c) =
1
if it,,,, > o
J2k sim(C,Ck) * rk/Y,k rk otherwise
# nodes		37 321
# sink (out-degree =	0)	1 538
# source (in-degree:	= 0)	0
# link		460 504
# totanl n. of topics		791
average in-degree average out-degree average topics per node
12.13 8.81 17.71
Table 1: Characteristics of the dataset extracted from Epin-ions website
for strong and smooth biased models, comparing them with an unbiased case (i.e. context independent) defined as follows:
pv,
{1/outdeg(v) o
if outdeg(v) > 0 otherwise
(7)
Therefore, the similarity function defined in eq. (5) is used in smooth biased approach to calculate the relatedness between users (see eq. (6)).
(6)
Table 3 shows the characteristics of the dataset extracted from Epinions website we used in our first set of experiments.
3.1 Results
The expertise network built from Epinions dataset is used to validate the proposed expertise rank assessment model, in particular we evaluate the stationary point of transformation of the transition matrix in eq. (4) using Pc as defined
In the unbiased case (eq. (7)), the transition probability from a node v to a node w is independent from the context c hence the steady state probability vector depends only on the structure of the expertise network, i.e. the more links a node has, the more often it will be visited by a random walker. This also means that using the unbiased random walker model an user that has a low number of links will receive a low expertise rank value, even if he is the only one labelled as expert on a given topic c.
In real life expertise is always assigned within a given context c and our idea is to capture this behaviour using a random walker biased by context, as explained in the previous sections. In order to validate the strong and smooth biased random walker models presented in section 2, we will show that the probability of reaching nodes with expertise identical or similar to the target c grows with respect to the unbiased case.
In the following we report the results of a set of experiments performed using the network we extracted from Epinions. For each experiment we set a specific topic c and we evaluate the expertise vector for the unbiased random walker and for both the strong and the smooth biased random walker models. Therefore, for each topic ci we sum all the expertise ranks (or steady state probability) of those users labelled with ci obtaining the so-called cumulative expertise of topic ci. It corresponds to the steady state probability that a random walker visits a node belonging to the topic ci. For the sake of simplicity, in the following all the Epinions' topics are indicated by a number instead of their names.
Figures 1,2,3,4 show the comparison of percentage increment of biased models cumulative expertise with respect to unbiased (eq. (7)) considering a common and a rare topic with respect to the unbiased case. In particular, we focused on topic #30 which is very common (i.e. 14995 links associated to it over 460504 total, 14995 source nodes over 35783 and 5134 targets over 26000), and on topic #536
CONTEXT-BASED GLOBAL EXPERTISE IN.
Informatica 34 (2010) 409-417 413
■ #30	#322	#426
		
100	200	300	400	500	600	700
Figure 1: Strong biased by #30
Figure 3: Strong biased by #536

Figure 2: Smooth biased by #30
which is quite rare (5 links, 5 sources and 1 target, respectively).
Results highlight that cumulative expertise on c always grows with respect to the unbiased case. Let us note that when expertise is biased by the common topic #30, the cumulative expertise related to some other topics (namely #322 and #426) also increase, whereas when the rare topic #536 is used only nodes labelled with such a topic are affected by biasing (figs. 1,2). The fact that biasing on topic #30 also affects the cumulative expertise of other topics is mainly due to the structure of Epinions network, indeed being topic #30 very common means that a large amount of nodes are somehow expert in that topic. Some of these nodes are also expert in topics #322 and #426 and a certain number of them have a high network degree, so there is a high probability that they are involved in most of paths followed by the random walker, hence the side effect of cumulative expertise increasing for topics #322 and #426 occurs.
Also note that expertise in smooth biased model increases much more for both rare and common topics with respect to the strong biased model, confirming the advantage in exploiting similarity between topics during expertise rank assessment (see fig. 3,4).
Another experiment focuses on Epinions' users, showing their expertise in the rare topic #536, where just one target node w is considered expert by just five other nodes.
Figure 4: Smooth biased by #536
Figure 5 highlights each user's expertise on topic #536, evaluated using unbiased, strong and smooth biased models. In particular, we focus on users #3442 and #577, where the former is the only user labelled as expert in topic #536. Expertise evaluated in unbiased and strong-biased case slightly differs for all nodes but #3442 as expected. Indeed the unbiased case for node #3442 shows an expertise value that is nearly zero due to the low number of links such a node has. This confirms that our biased models are able to capture the expertise of a user on a given topic even if the topic is rare and also the node has few links with respect the average nodes degree in the network. The diagram also shows that user #577's expertise for unbiased case is the same as strong biased case since it has no inlinks labelled with the topic #536.
The comparison of the smooth biased case with others is more interesting, indeed:
1.	node #3442's expertise increases much more than the corresponding strong biased model
2.	node #577's expertise increases also!
Item 1 is the expected behavior and confirms our hypothesis that the expertise of a node depends on the opinions of his/her neighbours. Item 2 instead puts in evidence
the influence of highly connected nodes on the expertise evaluation. Specifically, node #577 is much more connected than the average node's connectivity having an out-
414 Informática 34 (2010) 409-417
V. Carchiolo et al.
Figure 5: User's expertise assessment
degree of 1333 (versus a network average of 8.81), and in-degree of 241 (versus a network average of 12.13). This means that a random walker tends to visit such a node more frequently than the other nodes of the network, since it is included in many paths. In conclusion, the increasing of expertise not only trivially depends on the expertise in a given topic, but it is also affected by the structure of the network, i.e. the presence of hubs that can be somehow considered expert for (almost) any topic.
4 Application of expertise
The assessment of expertise within a social network described so far allows to globally establish how much a user is expert and in which topic; this is useful for several applications (as described in sec. 1), in particular we want to exploit expertise in recommender systems to predict the rating an user will assign to a given item (belonging to a specific category) based on other users reviews' on the same item, mediating such reviews with the reviewers expertise.
We first introduce a measure of the utility an user receives by a specific product, then we consider the Epinion social network again as the application scenario, showing how to predict product ratings using the global expertise ratings.
4.1 Getting Utility
In economics, the utility is a measure of the relative satisfaction deriving from the consumption of goods or services [7], hence we define the utility function u(o) that allows us to rank the user's preferences on consumed products, that is u(oi) > u(o2) means that user strictly prefers oi instead of o2. Based on the expertise assessment presented in previous sections we define the utility u(o) as follows:
u(o)
E
ieR°
\Ro\
(8)
where Ro is the set of nodes that provided a review about product o, ri is the global expertise rank associated to node i according to the eq. (1) and tO is the numeric score i assigned to o . Note that the real effectiveness of u(o) strictly depends on the number of existing rewiews, indeed the more reviews of distinct users are available, the more affordable u(o) will be.
It is reasonable however that u(o) actually depends on the specific user being considered, i.e. it can happen that u(o1) > u(o2) for an user v but not for w, for instance due to personal preferences or depending on the categories oi and o2 belong to. Based on this consideration, we introduce the utility function for a generic user v as uv (o) : I ^ [0,1], where I is the set of items (products). The uv (o) is simply defined as tVO if v rated o and zero otherwise.
In addition to the utilities functions, we also consider the expected utilities, used to predict how much an user likes an unknown product; in recommender systems users usually leverage the others' experience to predict such values. We then introduce the function e(o) : I ^ [0,1] (and similarly ev (o)) as the expected preference on product o (personal expected preference of v on o, respectively); in the following we exploit the global expertise ranks evaluated in (1) and the expertise network G defined in sec. 2 to evaluate both e(o) and ev (o). For the sake of clarity, we initially take into account only a topic, thus we label each arc of G(V,, L, Lab) with t instead of {ri, ci}.
If we want to neglect personal user preferences, i.e. we focus on e(o), it is reasonable that we use the u(o) defined in eq. 8 as the prediction value, so we assume that e(o) — u(o). As stated previously however, a better contribution is given by ev (o), thus we define ev(o) as follows:
>(o) =
i
iGN°
\N\
(9)
where NO is the set of v's neighbours that rated the product o. The formula can be extended by considering the (local) expertise rank each neighbour is given by v, in order to properly weight their contribution, thus we define:
Jocal
(o)
i
ieN° rvi
e
global (O) =
E
i
ieN »
E
ieN»
ev (o)= S ■ e° +(1 - S) ■ e:
global V
(10)
where rvi is the (local) expertise rating v assigned to his neighbour i, ri is the global expertise rank about i evaluated in eq. (1) and S allows to balance local and global contributions. Note that we avoid the use of product to calculate expected value for a resource since it always gives a resulting value lower than the highest factors; this is due to the term related with expertise, which is always less than 1 (in some of the example in this paper it ranges over [3 • 10~6,5 • 10~3] with an average value of 2 • 10~5).
o
r . r
o
r
V
r
o
rr
V
r
o
rr
CONTEXT-BASED GLOBAL EXPERTISE IN.
Informatica 34 (2010) 409-417 415
4.2 Products rating prediction
To apply utility functions defined previously in the Epinion dataset to predict product ratings by others experience and expertise, we divided the dataset into a testing dataset and a training dataset, according to insertion date, i.e. reviews and nodes belong to testing or training dataset according to the date they has been stored into Epinions. Currently our dataset contains about 433 000 reviews starting from Jan, 17 2001 up to 4/1/2009, the chosen training set contains all reviews up to Dec, 31 2007 i.e. about 381 000 entries that represents about 88% of total entries, the related ratings on reviews are about 11 800 000 that are 90% of total entries. Once created the dataset, we evaluate the global expertise using only the training dataset, then we begin to add new review according with the date they were actually written, evaluating the expected function (10). Finally the expected function is compared with the real value the user assigned to the resource, in order to assess the effectiveness of such approach.
This comparison is performed using the Mean absolute error (MAE), used in statistics [29] to measure how close predictions are to outcomes; MAE is defined as follows:
MAE :
Er=i \fi - Vi\ _ EHi \ei\
(11)
Table 2: mean absolute error
biasing	approach	category	mae
smooth	1	30	0.934
smooth	2	30	0.630
smooth	3	30	0.591
smooth	4	30 (Ö = 0.8)	0.592
strong	1	30	0.821
strong	2	30	0.571
strong	3	30	0.539
strong	4	30 (Ö = 0.8)	0.595
smooth	1	17	0.833
smooth	2	17	0.583
smooth	3	17	0.548
smooth	4	17 (Ö = 0.8)	0.595
where ti is the absolute error between the prediction fi and the true value yi, and n is the number of samples.
The simulation takes into account all Epinions users that wrote at least one review. Starting from jan 1,2008 we look for all reviews and compare the expected value to the real one. Results are summarized in table 2, where the fi used as the expected function is as follows:
1.	expected value is calculated using all existing reviews starting from jan 1, 2008, i.e. the prediction of (11) is the u(o) defined in eq. (8); this is indicated as approach 1 in table 2
2.	expected value is calculated using only users that belong to the neighbours of review's author. This is the approach 2 in table 2 and as fi we used the eq. (9)
3.	expected value is calculated using only users that belong to neighbours of review's author as in previous case (eq. (9) is used), though in this case just neighbours with some expertise in the category o belongs to are considered (instead of considering all neighbours). This approach is named as approach 3 in table 2.
4.	finally, in the approach 4 we used the eq. (10)
The experiments shown good results in all cases because the MAE is always less than 1 while rating ranges over the set {1, 2,3,4,5}. This means that if an expected rate is 3 for instance, the real rate is in the worst case 2 or 4, which can be considered acceptable since it is not a complete turnover of the expressed opinion.
However different strategies highlight that base weighted average (i.e. approach 1) always performs worst
than the approches that also exploits local information, in-fact approches 2,3 and 4 select the user to ask for their reference among own neighbourood. The best results are about 0.5 that highlight that error is quite marginal. Unfortunately using only own neighbourhood limits the number of resources (products) that can be ranked, therefore in a real system both approaches should be used
5 Conclusion
In this work we introduced the expertise as a global property of a node and we performed an assessment on the Epinions dataset. Epinions.com. Expertise has been defined using a biased random walker model and its corresponding probabilistic interpretation and has been applied to a dataset extracted from Epinions website, where a mechanism of expertise evaluation based on products review has been introduced, together with a similarity function used to exploit topic similarity for a better global expertise rank assessment. Results confirmed that the expertise can be considered a network property that depends on network structure and direct (local) users experience.
Moreover, we applied the global expertise ranks in rec-ommender systems to predict the rating an user would assign to a given item based on other users reviews' on the same item, mediating such reviews with the reviewers expertise. The prediction aimed at integrating both local and global contributions, and simulations shown the effectiveness of such an approach.
Some questions still remains to be addressed:
-	we have to investigate on different similarity functions (e.g. when more, different ontologies are present)
-	we worked supposing that all information about products ratings and expertise were available, but users may also somehow hide their personal preferences [28]
-	in [32] the question of expertise within online communities is addressed, and network structure as well as
n
n
417 Informática 34 (2010) 409-417
V. Carchiolo et al.
algorithms are tailored to the case of Java Forum; this suggests that a fundamental role is played by the specific (possibily complex) network being considered and by its properties [18, 2]. hence we have to test the proposed approach in other networks, since network structure may have a significant impact on both the effectiveness and the efficiency of the approach
References
[1]	Gediminas Adomavicius and Alexander Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. on Knowl. and Data Eng., 17(6):734-749, 2005.
[2]	Reka Albert and Albert-Laszlo Barabasi. Statistical mechanics of complex networks. Reviews of Modern Physics, 74:47, 2002.
[3]	Marko Balabanovic and Yoav Shoham. Fab: Content-based, collaborative recommendation. Communications of the ACM, 40:66-72, 1997.
[4]	Satanjeev Banerjee and Ted Pedersen. An adapted lesk algorithm for word sense disambiguation using wordnet. Computational Linguistics and Intelligent Text Processing, pages 117-171, 2002.
[5]	Pavel Berkhin. A survey on pagerank computing. Internet Mathematics, 2:73-120, 2005.
[6]	Hui Fang and ChengXiang Zhai. Probabilistic models for expert finding. In ECIR, pages 418-430, 2007.
[7]	Peter Fishburn. Utility Theory for Decision Making. Robert E. Krieger Publishing Co., 1970.
[8]	Norman Foo, Brian J. Garner, Anand Rao, and Eric Tsui. Semantic distance in conceptual graphs. Ellis Horwood, Upper Saddle River, NJ, USA, 1992.
[9]	Jared Freeman, Webb Stacy, Jean Macmillan, and Georgiy Levchuk. Capturing and building expertise in virtual worlds. In FAC '09: Proceedings of the 5th International Conference on Foundations ofAugmented Cognition. Neuroergonomics and Operational Neuroscience, pages 148-154, Berlin, Heidelberg, 2009. Springer-Verlag.
[10]	B.J. Garner, D. Lukose, and E. Tsui. Parsing natural language through pattern correlation and modification. In Proc. of the 7th International Workshop on Expert Systems & Their Applications, pages 12851299, Avignon, France, 1987.
[11]	T. Grandison and M. Sloman. A survey of trust in internet application. IEEE Communication Surveys and Tutorials, 4(4):2-16, 2000.
[12]	A. John and D. Seligmann. Collaborative tagging and expertise in the enterprise. Proc WWW 2006, 2006.
[13]	Henry Kautz, Bart Selman, and Mehul Shah. Re-ferralweb: Combining social networks and collaborative filtering. Communications of the ACM, 40:63-65, 1997.
[14]	Jon M. Kleinberg. Authoritative sources in a hyper-linked environment. Journal of ACM, 46(5):604-632, 1999.
[15]	Francisco J. Martin. Top 10 lessons learned developing, deploying, and operating real-world recom-mender systems. In Proceedings of 3rd ACM Conference on Recommender Systems, 2009.
[16]	D.W. McDonald and M. S. Ackerman. Expertise Recommender: A flexible recommendation system and architecture. In Proc. Int. Conf. on CSCW, 2000.
[17]	Rajeev Motwani and Prabhakar Raghavan. Randomized algorithms. Cambridge University Press, New York, NY, USA, 1995.
[18]	M. E. J. Newman. The structure and function of complex networks. SIAMReview, 45:167, 2003.
[19]	John O'Donovan and Barry Smyth. Trust in recommender systems. In IUI '05: Proceedings of the 10th international conference on Intelligent user interfaces, pages 167-174, New York, NY, USA, 2005. ACM.
[20]	Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd. The pagerank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab, 1999.
[21]	Philip Resnik. Using information content to evaluate semantic similarity in a taxonomy. In IJCAI, pages 448-453, 1995.
[22]	G. Salton, A. Wong, and C. S. Yang. A vector space model for automatic indexing. Communications of ACM, 18(11):613-620, November 1975.
[23]	Hector Garcia-Molina Sepandar D. Kamvar, Mario T. Schlosser. The eigentrust algorithm for reputation management in P2P networks. In proceedings of the Twelfth International World Wide Web Conference, 2003., 2003.
[24]	Upendra Shardanand and Patti Maes. Social information filtering: Algorithms for automating "word of mouth". In Proceedings of ACM CHI'95 Conference on Human Factors in Computing Systems, volume 1, pages 210-217, 1995.
[25]	Shopping.com Network. Epinions.com ©, http://www.epinion.com, 1999-2010.
CONTEXT-BASED GLOBAL EXPERTISE IN.
Informatica 34 (2010) 409-417 15
[26]	Loren Terveen, Will Hill, Brian Amento, David McDonald, and Josh Creter. Phoaks: a system for sharing recommendations. Communications of ACM, 40(3):59-62, 1997.
[27]	Loren Terveen and David W. McDonald. Social matching: A framework and research agenda. ACM Trans. Comput.-Hum. Interact., 12(3):401-434, 2005.
[28]	Frank E. Walter, Stefano Battiston, and Frank Schweitzer. A model of a trust-based recommendation system on a social network. Journal of autonomous agents and multi-agent systems, 16:57, 2008.
[29]	Frank Edward Walter, Stefano Battiston, and Frank Schweitzer. Personalised and dynamic trust in social networks. In Lawrence D. Bergman, Alexander Tuzhilin, Robin D. Burke, Alexander Felfernig, and Lars Schmidt-Thieme, editors, RecSys, pages 197204. ACM, 2009.
[30]	Wolfram	MathWorld. http://mathworld.wolfram.com/periodicmatrix.html -http://mathworld.wolfram.com/reduciblematrix.html, 1999-2010.
[31]	Jing Zhang, Jie Tang, and Juanzi Li. Expert Finding in a Social Network. LNCS, 2008.
[32]	Jun Zhang, Mark S. Ackerman, and Lada Adamic. Expertise networks in online communities: structure and algorithms. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 221-230, New York, NY, USA, 2007. ACM.