9781424472659/10/$26.002010 IEEE
Social Similarity as a Driver for Selﬁsh,Cooperative and Altruistic Behavior
Eva Jaho, Merkouris Karaliopoulos and Ioannis Stavrakakis
Department of Informatics and TelecommunicationsNational & Kapodistrian University of AthensIlissia, 157 84 Athens, GreeceEmail: {ejaho, mkaralio, ioannis}@di.uoa.gr
Abstract
—This paper explores how the degree of similaritywithin a social group can be exploited in order to dictate thebehavior of the individual nodes, so as to best accommodatethe typically noncoinciding individual and social beneﬁt maximization. More speciﬁcally, this paper investigates the impact of social similarity on the effectiveness of content dissemination,as implemented through three classes representing well thespectrum of behaviorshaped content storage strategies: theselﬁsh, the selfaware cooperative and the optimally altruisticones. This study shows that when the social group is tight (highdegree of similarity), the optimally altruistic behavior yields thebest performance for
both
the entire group (by deﬁnition) andthe individual nodes (contrary to typical expectations). When thegroup is made up of foreigners with almost no similarity, altruismor cooperation cannot bring much beneﬁts to
either the group or the individuals
and thus, a selﬁsh behavior would make sensedue to its simplicity. Finally, the selfaware cooperative behaviorcould be adopted as an easy to implement distributed scheme– compared to the optimally altruistic one – that has close tothe optimal performance for tight social groups, and has theadditional advantage of not allowing mistreatment to any node(i.e., the content retrieval cost become larger compared to thecost of the selﬁsh strategy).
I. I
NTRODUCTION
Today’s networks can be highly personalized, in the sensethat their structure and usage are shaped by the personalinterests and behavior in general of the participating nodes.Nodes in such networks – referred to as social networks –are typically well connected, develop reciprocal trust relations,and have some common features, such as the content they areinterested in and the places they tend to visit. Groups of suchnodes are called social groups [11].In this paper, we consider a group of (networked) nodeswith common interests in content – more generally: objects;in computer science objects of interest are usually informationobjects, such as ﬁles and software. It is assumed that nodesof this social group store objects in their limited local storageto retrieve them when desired at minimum cost. If the nodesdo not possess a desired object, they can fetch it either froma node in the group at some lowmedium cost or – if notavailable in the group – from a node outside the group athigher cost. The lowmedium cost associated with fetching
This work has been supported by the ISTFET project SOCIALNETS (FP7IST217141).
an object from within the group may reﬂect actual or virtualprice, access delay due to the locality of the fetching processor level of connectivity, and level of trust and cooperation.As the local storage is assumed to be limited when compared with the plethora of objects possibly desired, an inherently selﬁsh node would tend to store locally objects of higher personal interest. Our past work in [8] has shownthat this is not the best content placement strategy for anode in a distributed group with the three levels of contentaccess cost considered here. Instead, a cooperative contentplacement strategy has been devised based on gametheoreticarguments. The cooperative strategy determines which objectseach node should store locally, so that the total content accesscost for each and every node is no more than (and typicallymuch lower than) that induced under the selﬁsh strategy. Thelatter property implies that the content placement strategy is
mistreatmentfree
: no node will lose by participating in thegroup, compared to acting selﬁshly. Hereafter we will refer tothis placement strategy as the selfaware cooperative strategy,due to its mistreatmentfree property and cooperative nature.Mistreatmentfree strategies are key to the sustainabilityof such distributed selﬁsh groups, as they motivate users toparticipate in the group and share objects with others. Thesocial beneﬁt (i.e., the average beneﬁt over all nodes) inducedby the selfaware cooperative strategy is not optimal. Thestrategy that implements a content assignment that maximizesthe social beneﬁt will be referred to hereafter as the optimallyaltruistic strategy; this (optimal) content assignment can bederived by solving an optimization problem (as done, forinstance, in [9]). The implementation of the optimally altruisticstrategy would require the exchange of richer informationamong the nodes in the group (local demand distributions),whereas the selfaware cooperative strategy requires the exchange of some limited information among the nodes in thegroup (indices of content stored locally). Finally, note that tomaximize the total beneﬁt, some of the nodes may end upgaining too much and others being mistreated.
Focus of this paper
: It is, therefore, evident that a nodeparticipating in a distributed group may face a dilemma asto which strategy, to follow. In this paper the characteristicsof the social group are exploited in order to help addressthe above dilemma. We propose an innovative approach to
characterizing the similarity of the social group nodes withrespect to content interests and introduce a group
tightness
metric. The dependence of the induced social and individualnode beneﬁts on the level of
tightness
of the social groupis clearly established for all three aforementioned contentplacement strategies, which reﬂect general patterns of socialbehavior. Our results let us draw important conclusions andguidelines for the content placement strategy a node shouldadopt for a given level of
tightness
in the social group.
Related work
:The exploitation of social characteristics fordata dissemination in autonomic and opportunistic networkshas been considered from various aspects, in the literature.In [1], the authors construct a dynamic learning algorithmwhere nodes from various social communities opt for a utilitymaximizing content placement strategy based on their encounters with other nodes. [14] studies the impact of different"levels of altruism" of nodes involved in the disseminationprocess. In a more abstract setting, the effect that a node’srelational position in the group has on content disseminationhas been considered in [4]. The importance of designingsociallyaware opportunistic networks is also demonstrated in[2], [5], [7]. Finally, for a review of data dissemination in the
general context of opportunistic networks, readers are referredto [3].This study applies to social networks with interactionsbetween computer devices having limited memory resources.These are typically encountered in mobile opportunistic networks that are additionally “socially aware”, meaning thateither the nodes or their human users are aware of theformation of social groups and the potential beneﬁts fromparticipation in such a group.In Section II we formulate the problem and introduce thetightness metric capturing the degree of similarity of interestswithin a social group. In Section III the three strategies forcontent placement are brieﬂy described. These strategies arecompared to eachother in section IV, under different
tightness
values, with respect to the induced content retrieval cost atboth the individual node and entire group level. Finally, wesummarize the major conclusions of the paper in Section Vand point to interesting problems for future work.II. P
ROBLEM
F
ORMULATION
A
ND
T
IGHTNESS
M
ETRIC
We assume that there are
N
nodes in a social group andeach node has its own probability distribution of interest in
M
information objects (preferences). Let
M
=
{
1
,
2
,...,M
}
bethe set of objects,
N
=
{
1
,
2
,...,N
}
be the set of nodes and
F
n
be the interest distribution of node
n
over the objects.
F
nm
can also be viewed as the request rate of node
n
, (
n
= 1
,...,N
)for object
m
, (
m
= 1
,...,M
). All objects are assumed to beunitsized. Node
n
has a storage capacity of
C
n
units.Let
P
n
denote the
placement
of node
n
, deﬁned to be the setof objects stored locally at this node. Without loss of generality, we take

P
n

=
C
n
since a node can always gain by savingobjects of interest locally in its storage than having to retrievethem from a distant source. Let
P
=
{
P
1
,P
2
,...,P
N
}
denotethe global placement for the social group and
P
−
n
=
P
\
P
n
the set that contains the placements of all nodes in the groupexcept for node
n
.Assume that the cost for accessing an object from a node’slocal storage is
t
l
, from another remote node in the group
t
r
and from another node in another group
t
s
, with
t
l
< t
r
< t
s
.These values are assumed to be the same for all nodes in orderto simplify the analysis. In reality, the three cost types may bedifferent for each node; yet the costs for different node pairswithin a social group are expected to be similar.Given an object placement
P
, the mean access cost per unittime for node
n
is given by:
C
n
(
P
) =
m
∈
P
n
F
nm
t
l
+
m/
∈
P n,m
∈
P
−
n
F
nm
t
r
+
m/
∈
P n,m/
∈
P
−
n
F
nm
t
s
.
(1)The ﬁrst summation corresponds to the mean cost of accessingobjects locally; the second term refers to the mean cost of accessing them from nodes within the social group; and thethird sum accounts for the mean cost of accessing objects notstored anywhere in the group, i.e., from a node external to thegroup.To deﬁne
tightness
as a measure of similarity between thenodes’ preferences for objects, we used the Kullback Leibler(KL) divergence [6], a wellknown metric capturing thedivergence between two distributions. The KullbackLeiblerdivergence of distribution
Q
from
S
is deﬁned as:
D
S,Q
=
i
S
(
i
)
logS
(
i
)
Q
(
i
)
.
Besides being always nonnegative, KL has some desirableproperties in the considered context of the paper that favorit over other wellknown (dis)similarity metrics, such as theKolmogorovSmirnov distance Spearman’s rank correlationcoefﬁcient, proportional similarity, and total variation distance[10], [13]. As opposed to the KolmogorovSmirnov distancewhich takes the supremum of the differences over all elementsof a distribution, in the KL divergence all differences contribute to the calculation. Thus, the number of preferencesthat two nodes have into account is also implicitly considered.Spearman’s rank correlation coefﬁcient describes the degreeof association in the ranking of the distribution elements;hence, it would not provide us with insights when the rankingof interest in objects is the same between two nodes butthe actual distribution values are different. Finally, the KLdivergence permits our metric of tightness to take a broaderrange of values compared to the proportional similarity or totalvariation distance. Our comparative evaluations suggest thatKL is more sensitive to changes of distribution values, andthus can more accurately depict differences in interest proﬁles.KL is not a measure of distance since
D
S,Q
=
D
Q,S
. Tocome up with one, we invoke the symmetrized divergence,which is deﬁned as:
D
S,Q
=
D
Q,S
=
D
S,Q
+
D
Q,S
.
Hence, we can deﬁne
D
F
i
,F
j
as the distance of preferencedistributions of nodes
i
and
j
(to be referred to hereafter2
as the preference distance between
i
and
j
). Notice that thepreference distance is always nonnegative,
D
F
i
,F
j
≥
0
[6].The average preference distance of the group is then deﬁnedas the average of the pairwise preference distances, computedover all
N
(
N
−
1)
/
2
node pairs in the group:
ˆ
D
F
=
(
i,j
)
D
F
i
,F
j
N
(
N
−
1)
/
2
,
Finally, we deﬁne
tightness
T
to be the inverse of the averagepreference distance of the group:
T
= 1ˆ
D
F
.
(2)
Tightness
expresses the similarity of interests among nodesof the social group and is always greater or equal to zero.
T
→∞
when the interests of nodes for the objects coincide,whereas
T
→
0
implies that the nodes have completelydifferent preferences. As
T
is an average metric of interest“closeness” among the nodes of a social group, it is clear thata given value of
T
may arise under different sets of interestdistributions of the nodes. In order to draw more insightfulconclusions in the current study, we consider the followingtwo cases of dissimilarity in the nodes’ interest distributions:
•
Case 1: The order (rank) of the objects remains thesame for all nodes (i.e., the ﬁrstranked object for allnodes is the same, the secondranked object is the same,etc). However, the interest distributions are different:they become more concentrated around the most popularobjects as the node index
n
increases.
•
Case 2: The interest distributions are identical for allnodes but the ranking of a given object may change fordifferent nodes (e.g., nodes do not necessarily have thesame object as their
k
thranked one).In the numerical examples in this paper
N
= 5
nodes (tobetter illustrate the results), the capacity
C
= 10
objects, theobject population is
M
= 50
objects and
t
l
= 0
,
t
r
= 10
and
t
s
= 20
cost units. All preference distributions in the testcases that follow are Zipf distributions. The Zipf distributionhas been shown to be a good model for the popularity of webobjects [12], which could also constitute the bulk of trafﬁc inour scenario. Moreover, it is remarkably ﬂexible in capturinga wide range of distributions, from the uniform (for skewnessparameter
s
= 0
) to much more heavytail distributions withhigher skewness values (
s >
0
).
A. Case 1
The request rates
F
nm
of node
n
over the objects
m
=1
,
2
,...,M
are drawn from a Zipf distribution with differentexponent
s
for each node. We consider that Node
1
hasuniform interest distribution, i.e.,
s
= 0
for this node. Then,for node
n
,
n
= 2
,
3
...,N
,
s
is increased by
p
(
n
−
1)
, where
p
∈
R
is the increment parameter. For example, when
p
= 0
.
2
,
s
= 0
.
2
for Node
2
,
s
= 0
.
4
for Node
3
, and so on. The requestrate of node
n
for object
m
is given by:
F
nm
=
f
(
m
;
s,M
) = 1
/m
s
M l
=1
1
/l
s
.
(3)
Table IE
XAMPLE TIGHTNESS VALUES
(a) T when increased by different values of
p
Increment parameter (p) Tightness (T)0.0
∞
0.2 2.08610.4 0.46140.6 0.23980.8 0.16971.0 0.1362
(b) T when shifted by different values of
k
Shift parameter (k) Tightness (T)0
∞
1 0.36882 0.26743 0.22944 0.20895 0.19626 0.187610 0.0012
A preference ranking of the
M
objects is determined by thisdistribution – which are accordingly ranked as
[1
,
2
,...,M
]
– and is the same for all nodes. As
s
increases, a node’sdistribution becomes more concentrated in the ﬁrst objects.Table I(a) shows the value of
tightness
when the interestdistributions are derived as outlined above, for different valuesof the increment parameter
p
. Notice that
tightness
decreasesas
p
increases. This is because as
p
increases, the pairwisedistances between any two node distributions increase (thedifference in their
s
parameter is higher).
B. Case 2
The request rates
F
nm
are drawn from a Zipf distributionwith exponent
s
with
s
= 1
for all nodes, which is given by(3). If the rank of objects is the same for all nodes, we derivefrom (2) that
T
→∞
.In order to establish dissimilarity in the nodes’ interests, wecreate a different rank of objects for each node. To this end,Node
1
is assigned the rank of objects
[1
,
2
...,M
]
and this rank is shifted to the left by different positions for each of the othernodes. We consider different values of the shift parameter. Ingeneral, the rank of objects of node
n
,
n
= 1
,...,N
is shiftedby
k
(
n
−
1)
positions, where
k
∈
N
is the shift parameter. Forexample, when
k
= 1
, the ranking of Node
1
is
[1
,
2
,...,M
]
,the ranking of Node
2
is
[2
,
3
,...,M,
1]
, the ranking of Node
3
is
[3
,
4
,...,M,
1
,
2]
, and so on. Table I(b) shows the value of
tightness
when the interest distributions are shifted by variouspositions, as described above. Notice that
tightness
decreasesas the shift parameter increases. This is because the averageabsolute difference of the distributions (for the same object)between any two nodes increases, as
k
increases. Generally,numerical values of
tightness
are close to zero, and increaseabruptly as distributions become more similar.III. C
ONTENT
P
LACEMENT
S
TRATEGIES
In this section we describe the object placement strategiesthat we consider in this paper. Under the
Optimally altruistic
3
strategy the objects are stored in such a way that the totalaccess cost for all nodes in the social group is minimized (i.e.,minimize
N n
=1
C
n
(
P
)
). This problem can be transformedinto a 01 integer programming problem.Let
X
nm
=
1
,
if
m
∈
P
n
;
0
,
otherwise and
Y
nm
=
1
,
if
m /
∈
P
n
and
m
∈
P
−
n
;
0
,
otherwise.The objective is to minimize the function of the total accesscost:
N
n
=1
M
m
=1
X
nm
F
nm
t
l
+
Y
nm
F
nm
t
r
+ (1
−
X
nm
)(1
−
Y
nm
)
F
nm
t
s
,
where
Y
nm
= (1
−
X
nm
)(1
−
N
j
=1
j
=
n
(1
−
X
jm
))
.
This is a quadratic programming problem, whose solutionis very difﬁcult. A reformulation of the above to a linearproblem is derived in [9].Under the
Selﬁsh
strategy, or
greedy local
strategy asreferred to [8], the nodes only store their most preferableobjects. Each node
n
ranks the objects in a decreasing order
[1
,
2
,...,M
]
such that
F
n
1
≥
F
n
2
≥
...
≥
F
nM
and selects tostore the ﬁrst
C
n
ones. Thus,
P
n
= [1
,
2
,...,C
n
]
.Finally, under the
Selfaware cooperative
strategy each nodeﬁrst stores its
C
n
most preferable objects and then makesreplacements based on the placements of the other nodesin order to improve its placement [8]. Thus, a node maydecide to evict an object that exists in some other node inthe group in order to insert a new object, if this incurs anaccess cost reduction in (1). As the nodes play sequentially,each replacement made by a node may negatively affect theaccess cost of other nodes. It is proved in [8] that thisstrategy is mistreatmentfree, i.e., for any node
n
, it holdsthat
C
C n
(
P
)
≤ C
S n
(
P
)
, where
C
C n
(
P
)
denotes the access costof node
n
for all the objects under the selfaware cooperativestrategy and
C
S n
(
P
)
denotes its access cost under the selﬁshone. Thus, the ﬁnal access cost of a node is at most as highas its access cost under the selﬁsh strategy.It was shown in [8] that only objects not already includedin the group can be inserted during a replacement step,evicting only objects which are present elsewhere in thegroup (duplicates). Due to these properties (and as the resultsfollowed have shown) the performance of the selfawarecooperative strategy can be very close to the optimally (socialcost minimizing) altruistic one. In Section IV we explore underwhich conditions this performance is achieved.On a more practical note, an optimally altruistic behaviorrequires complete knowledge of the group’s characteristics(demand patterns of all nodes in the group) and a solutionto the global optimization problem. This can be solved bya central authority that dictates its placement decisions tothe nodes, or in a distributed fashion, in which each nodesolves the global optimization problem and then all nodesnegotiate their placements (there are multiple solutions tothe global optimization problem). The selfaware cooperativestrategy requires less information;more implementationdetailscan be found in [8]. The selﬁsh strategy has the smallestimplementation cost, since each node does not need to beaware of the group’s interests.IV. N
UMERICAL
E
VALUATION
In this section we present some numerical examples toillustrate the impact of
tightness
on the access cost underthe three behaviorbased content placement strategies. Theconclusions drawn from these results can help establish clearguidelines as to which strategy would be beneﬁcial to theindividual nodes and/or the entire group, and under whichtightness conditions in the group.Fig. 1 and 2 show the individual node and total (i.e., for theentire group) access cost under the different content placementstrategies and for different values of
tightness
. Both scenariosfor the interest distribution dissimilarity are considered (Case
1
and Case
2
, Section II).
A. Social groups with inﬁnite or very high
tightnessThe results show that the optimally altruistic strategy is thebest performing one regarding both the individual cost for anynode (Fig. 1(a)), as well as the cost for the entire group (Fig. 2,
T
=
∞
), for both Cases
1
and
2
. Consequently, the optimallyaltruistic behavior is the clear winnerbehavior for any nodein a very tight social group.Notice that under very high
tightness
, the individual andtotal access cost induced by the selfaware cooperative strategyis (a) very close to (slightly higher than) that under the optimally altruistic and (b) is always lower than that under the selfish strategy. In other words, the selfaware cooperative strategyinduces no node mistreatment while yielding performanceclose to the optimal. Thus, given its lower implementationcomplexity compared to the optimally altruistic (see SectionIII), it may be selected as an easier to implement and similarlyperforming alternative to the optimally altruistic strategy invery tight social groups.To this end, the larger the
tightness
of the social group, thegreater the group’s beneﬁts when nodes are either cooperativeor optimally altruistic, compared to being selﬁsh.
B. Social groups with very low
tightnessWhile it was shown that under very high
tightness
boththe individual nodes and the entire group will beneﬁt byhaving the nodes adopt the optimally altruistic or the selfaware cooperative strategies (compared to the selﬁsh one), Fig.1(d) and 2 (for
T
0
) show that this is not the case underlow or very low
tightness
of the social group.Although the optimally altruistic strategy (always) bringsthe maximum beneﬁts for the group, it mistreats individualnodes under such group tightness conditions. (e.g., Node 5 inFig. 1(d), for both Cases
1
and
2
). Furthermore, the beneﬁts tothe group are about the same as under the selﬁsh strategy or4
0 5 10 15 2012345
A c c e s s c o s t
Node idT=
∞
SelfishSelfaware cooperativeOptimally altruistic 0 5 10 15 2012345
A c c e s s c o s t
Node idT=
∞
SelfishSelfaware cooperativeOptimally altruistic
(a)
0 5 10 15 2012345
A c c e s s c o s t
Node idT=2.09SelfishSelfaware cooperativeOptimally altruistic 0 5 10 15 2012345
A c c e s s c o s t
Node idT=0.27SelfishSelfaware cooperativeOptimally altruistic
(b)
0 5 10 15 2012345
A c c e s s c o s t
Node idT=0.24SelfishSelfaware cooperativeOptimally altruistic 0 5 10 15 2012345
A c c e s s c o s t
Node idT=0.21SelfishSelfaware cooperativeOptimally altruistic
(c)
0 5 10 15 2012345
A c c e s s c o s t
Node idT=0.14SelfishSelfaware cooperativeOptimally altruistic 0 5 10 15 2012345
A c c e s s c o s t
Node idT=0.19SelfishSelfaware cooperativeOptimally altruistic
(d)Figure 1. Individual access cost under different strategies for different values of
tightness
T
, under Case
1
(ﬁgures on the left) and Case
2
(ﬁgures on theright)
5