The scale-free nature of the web of sexual contacts.
Many "real-world" networks are clearly defined [1], while most "social" networks are
to some extent subjective [2,3]. Indeed, the accuracy of
empirically determined social networks is a question of some concern because
individuals may have distinct perceptions of what constitutes a social link.
One unambiguous type of connection is sexual contact. Here we analyze data
on the sexual behavior of a random sample of individuals [4], and find that the cumulative
distributions of the number of sexual partners during the twelve months prior
to the survey decay as a power law with similar exponents for females and males.
Recent studies of real-world networks [1] have mathematically formalized the "six degrees of separation" concept put forth in the classic study of Milgram [5]. This so-called small-world phenomenon [6,7] refers to the surprising fact that networks have small average path lengths between nodes while preserving a large degree of "clustering" [3]. Small-world networks may belong to one of three classes--single-scale, broad-scale, or scale-free--depending on their connectivity distribution P(k), where k is the number of links connecting to a node [8]. Scale-free networks--which are characterized by a power-law decay of the cumulative distribution, P(k) ~ k^(-alpha)--may be formed due to preferential attachment, i.e., new links are established preferentially between nodes with high connectivities [9,10].
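The preferential attachment idea described above can be illustrated with a toy growth rule. The following is only a minimal sketch, not the model of any cited reference: it assumes that each new node brings exactly one link and that the target is chosen with probability proportional to its current number of links. The function names and the network size are hypothetical choices made for illustration.

```python
import random

def preferential_attachment_degrees(n_nodes, seed=0):
    """Grow a toy network: each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    degrees = [1, 1]      # start from a single link between nodes 0 and 1
    endpoints = [0, 1]    # node i appears in this list once per link it holds
    for new_node in range(2, n_nodes):
        # Picking uniformly from the endpoint list is a degree-proportional choice.
        target = rng.choice(endpoints)
        degrees[target] += 1
        degrees.append(1)
        endpoints.extend([target, new_node])
    return degrees

def fraction_at_least(degrees, k):
    """Cumulative distribution: fraction of nodes with degree >= k."""
    return sum(d >= k for d in degrees) / len(degrees)

degrees = preferential_attachment_degrees(20000)
mean_degree = sum(degrees) / len(degrees)
print("mean degree:", mean_degree, "largest degree:", max(degrees))
for k in (1, 2, 4, 8, 16, 32):
    print(k, fraction_at_least(degrees, k))
```

In this toy network the mean degree stays close to 2, yet a few early nodes accumulate far more links, which is the "rich get richer" behavior that produces a slowly decaying cumulative distribution.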
We analyze data gathered in a 1996 Swedish survey of sexual behavior [4]. The survey--involving a random sample of 4781 Swedish individuals (ages 18-74 yr)--used structured personal interviews and questionnaires to collect information. The response rate was 59 percent, corresponding to 2810 respondents. Two independent analyses of non-response error reveal that elderly people, and especially elderly women, are under-represented in the sample; apart from this skewness, the sample is representative in all demographic dimensions.
Connections in the network of sexual contacts appear and disappear as sexual relations are initiated and terminated. To analyze the connectivity of this dynamic network, whose links may be quite short-lived, we first analyze the number k of sex partners over a relatively short time window--the twelve months prior to the survey. Figure 1a shows the cumulative distribution P(k) for both female and male respondents. The data closely follow a straight line in a double-logarithmic plot, consistent with a power-law dependence. The data show that males report a larger number of sexual partners than females do [11], but that both have the same scaling properties.
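To make the "straight line in a double-logarithmic plot" concrete, the sketch below computes an empirical cumulative distribution and estimates its slope by least squares in log-log coordinates. It is an illustration under stated assumptions only: the sample is synthetic (not the survey data), the exponent `alpha = 2.0` is an arbitrary choice, and a least-squares fit is just one simple way to estimate a slope, not necessarily the procedure used for Figure 1.

```python
import math
import random

def sample_power_law(n, alpha, seed=0):
    """Synthetic integers k >= 1 with P(K >= k) = k**(-alpha) (illustrative data)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], which avoids raising zero to a negative power.
    return [int((1.0 - rng.random()) ** (-1.0 / alpha)) for _ in range(n)]

def cumulative_distribution(values, min_count=1):
    """Sorted (k, fraction of values >= k) pairs, keeping only those k for which
    at least min_count observations are >= k."""
    n = len(values)
    points = []
    for k in sorted(set(values)):
        count = sum(v >= k for v in values)
        if count >= min_count:
            points.append((k, count / n))
    return points

def loglog_slope(points):
    """Least-squares slope of log(fraction) against log(k)."""
    xs = [math.log(k) for k, _ in points]
    ys = [math.log(p) for _, p in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

alpha = 2.0  # assumed exponent for the synthetic sample, not a survey result
values = sample_power_law(20000, alpha)
points = cumulative_distribution(values, min_count=10)
slope = loglog_slope(points)
print("estimated slope:", slope, "(the synthetic sample was built with", -alpha, ")")
```

Dropping the points supported by fewer than ten observations keeps the noisy far tail from dominating the fit.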
These results contrast with the exponential or Gaussian distributions--for which there is a well-defined scale--recently found for friendship networks [8]. Plausible mechanisms that could account for the observed structure include: (i) increased skill in getting new partners as the number of previous partners grows, (ii) different levels of attractiveness, and (iii) the need to have many new partners to maintain self-image. Thus, the data are consistent with the preferential attachment mechanism. Perhaps, in sexual contact networks, as in other scale-free networks, "the rich do get richer" [9,10].
We next analyze the total number k_tot of partners in the respondent's life up until the time of the survey. This quantity is not relevant to the "instantaneous" structure of the network but may help elucidate the mechanisms responsible for the distribution of the number of partners. Figure 1b shows the cumulative distribution P(k_tot). For k_tot > 20, the data follow a straight line in a double-logarithmic plot, consistent with a power-law dependence in the tails of the distribution.
Our major finding is the scale-free nature of the connectivity of an objectively defined, non-professional, social network. This result shows that the concept of the "core group" considered in epidemiological studies [12] is somewhat arbitrary, as there is no well-defined threshold or boundary separating the core group from other individuals (as there would be for a bimodal distribution). Our findings also have possible epidemiological implications. First, epidemics arise and propagate much faster in scale-free networks than in single-scale networks [6,13]. Second, measures to contain or stop the propagation of diseases in a network must be radically different for scale-free networks.
Specifically, the study of scale-free networks indicates that they are resilient to random failure but highly susceptible to destruction of the best-connected nodes [14], while single-scale networks are not susceptible even to attacks on the best-connected nodes. Hence, the possibility that the web of sexual contacts has a scale-free structure indicates that strategically targeting safe-sex education campaigns at those individuals with a large number of partners may significantly reduce the propagation of sexually transmitted diseases.
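The contrast between random failure and targeted removal of the best-connected nodes can be explored with a small simulation. The sketch below is a toy illustration under several assumptions that are not part of the study: the network is grown by a simple preferential-attachment rule with two links per new node, 5% of the nodes are removed, and the damage is measured as the fraction of all original nodes that remain in the largest connected cluster. All function names and parameter values are hypothetical.

```python
import random

def grow_network(n_nodes, links_per_node=2, seed=0):
    """Toy preferential-attachment growth; returns {node: set of neighbours}."""
    rng = random.Random(seed)
    adjacency = {0: {1}, 1: {0}}
    endpoints = [0, 1]  # each node appears once per link it holds
    for new in range(2, n_nodes):
        targets = set()
        while len(targets) < min(links_per_node, new):
            targets.add(rng.choice(endpoints))  # degree-proportional choice
        adjacency[new] = set()
        for target in sorted(targets):
            adjacency[new].add(target)
            adjacency[target].add(new)
            endpoints.extend([target, new])
    return adjacency

def largest_cluster_fraction(adjacency, removed):
    """Fraction of the original nodes left in the largest connected cluster
    once the nodes in `removed` are deleted."""
    seen = set(removed)
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search over the surviving nodes
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best / len(adjacency)

network = grow_network(2000)
n_removed = len(network) // 20  # remove 5% of the nodes
random_removed = random.Random(1).sample(sorted(network), n_removed)
hubs_removed = sorted(network, key=lambda v: len(network[v]), reverse=True)[:n_removed]

after_random = largest_cluster_fraction(network, random_removed)
after_attack = largest_cluster_fraction(network, hubs_removed)
print("largest cluster after random removal:  ", after_random)
print("largest cluster after targeted removal:", after_attack)
```

In this toy setting, removing the most connected nodes shrinks the largest cluster more than removing the same number of randomly chosen nodes, which is the intuition behind targeting prevention efforts at individuals with many partners.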
Discussion. This study grew out of our interest in understanding the structure of social networks. A difficulty in studying generic social networks is that the links (the social connections) may be defined in a subjective way. Imagine, for example, that a researcher is trying to construct the friendship network of students at some school. This researcher's work would be very difficult because what one student would define as a friendship, another would describe as a mere acquaintance. Because of this subjectivity, we tried to identify a type of social network where the connections would be very clear: "1 or 0", "there or not there". Sexual relationships clearly satisfy this criterion: two people either had sex or they didn't. So the network is well-defined.
Several different quantities have been shown recently to reveal important information about the structure of a network. Unfortunately, many of these quantities require full knowledge of the network, i.e., of who is connected to whom. Since we do not have access to such information, the aspect of the network that we investigated in our study is the number of connections (in this case, number of sexual partners) that each individual has. Surprisingly, we found that the distribution has an unusual form: it is a power-law function.
What makes power-law distributions interesting is that very large values--i.e., values much larger than the mean--can be observed. Power-law distributions are for this reason said to have no scale, or to be scale-free. In contrast, bell-shaped distributions such as the one describing the height of humans are characterized by a single scale. It is the fact that we associate a scale with the height of humans that would lead us to reject as false any report of a human even twice as tall as the mean. For power-law distributions, values even 10 times larger than the mean can be observed in quite small samples.
For example, the mean number of partners since sexual initiation for women is approximately 7, and in a sample of fewer than 1500 Swedish women we find an individual with 100 partners. For men, the mean is approximately 15 and we find an individual with 800 partners, which is roughly 50 times the mean! So, somehow Don Juan was not such an extraordinary case but just one data point in a wide spectrum of behaviors that can be observed.
Power-law distributions also have the important feature that the few most connected individuals are responsible for a very large fraction of all connections. Specifically, in the case of the web of human sexual contacts, we find that the 10% most connected men account for 48% of the sexual connections, while the 1% most connected account for about 15%. For women, the numbers are 40% of connections for the 10% most connected and 10% for the 1% most connected. In contrast, for men, the 50% least connected are responsible for 12% of sexual contacts (about as many as the 1% most connected), and for women the 50% least connected are responsible for 15% of contacts (also about as many as the 1% most connected).
Our discovery also has consequences for the spread of sexually transmitted diseases (especially AIDS). Recent results suggest that epidemics can spread in scale-free networks even for diseases that have low infectivity. So these diseases become essentially impossible to eradicate because they keep spreading. The reason is that a person with many partners acts as a hub for the spread of the disease. If one has many partners, then there is a high probability of eventually contracting the disease. Moreover, once a person with many partners gets infected, it is also likely that this person will eventually pass the disease on to others. This explains why, even though people at first thought that AIDS would affect only the so-called risk groups, it quickly spread to the general population.
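Concentration figures such as "the 10% most connected hold 48% of the connections" can be computed from a list of partner counts as sketched below. This is only an illustration: the function name is hypothetical, and the ten-value list is made-up toy data, not the survey responses.

```python
def share_held_by_top(partner_counts, top_fraction):
    """Fraction of all connections held by the top_fraction most connected people."""
    total = sum(partner_counts)
    if total == 0:
        raise ValueError("no connections in the sample")
    ordered = sorted(partner_counts, reverse=True)
    n_top = max(1, round(top_fraction * len(ordered)))  # at least one person
    return sum(ordered[:n_top]) / total

# Made-up toy sample of ten individuals (NOT the survey data).
toy_counts = [0, 1, 1, 1, 2, 2, 3, 5, 10, 25]

top_10_percent = share_held_by_top(toy_counts, 0.10)
bottom_50_percent = 1.0 - share_held_by_top(toy_counts, 0.50)
print("share held by the 10% most connected: ", top_10_percent)
print("share held by the 50% least connected:", bottom_50_percent)
```

In this toy sample a single individual holds half of all connections, while the least connected half of the sample holds only a tenth of them.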
Note that researchers in epidemiology identified long ago so-called "core groups" that play a very important role in the spread of sexually transmitted diseases (STDs). The idea is that, for example, prostitutes have many partners and hence will have a disproportionate effect on the spread of STDs. However, the idea conveyed by the concept of a core group is that the population is divided into two sub-populations: the core group (with many partners) and the other ("normal"?) people. Instead of confirming this "neat" separation, we find what can be called a "spectrum of behaviors", which is quantified by the power-law distribution of the number of partners that we report.
Scale-free versus single-scale networks. An example of a single-scale network is the chess board. Suppose you place a person on each square of a chess board and that each individual establishes connections--i.e., has sexual contacts--with the persons standing to the left, right, back and front of them. Then, this picture defines a network in which every individual away from the edges of the board has exactly 4 partners. This number, 4, is the scale of the connectivity of the "chess-board" network. The actual web of human sexual contacts comprises individuals with widely different numbers of connections. For example, the average number of partners since sexual initiation for males is approximately 15. Nonetheless, we recorded a male who had 800 different partners. That is roughly 50 times the mean! This fact is what makes the web of sexual contacts scale-free: one can observe connectivities that are much larger than the sample's mean, i.e., our intuition fails to come up with a scale.
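The chess-board picture can be checked with a few lines of code. The sketch below counts, for every square of an 8 x 8 board, how many of the four neighbouring squares (left, right, back, front) exist; the function name is a hypothetical choice. On a finite board the squares on the edges and corners have 3 and 2 neighbours rather than 4, but no square ever exceeds 4, so the largest connectivity stays close to the mean.

```python
from collections import Counter

def chessboard_degrees(size=8):
    """Number of left/right/back/front neighbours for each square of the board."""
    degrees = []
    for row in range(size):
        for col in range(size):
            neighbours = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
            # Count only the neighbouring positions that lie on the board.
            degrees.append(sum(0 <= r < size and 0 <= c < size for r, c in neighbours))
    return degrees

board = chessboard_degrees()
board_mean = sum(board) / len(board)
print("degree counts:", dict(Counter(board)))
print("mean:", board_mean, "maximum:", max(board))
```

Here the maximum connectivity (4) is barely above the mean (3.5), in contrast with the survey figures quoted above, where the largest reported value is roughly 50 times the mean.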
Bibliography:
Acknowledgments. We thank G. Helmius for making the Swedish survey data available to us. FL thanks STINT (97/1837) and HSFR (F0688/97), and CE thanks HSFR (F0624/1999) for support. LANA and HES thank NIH/NCRR (P41 RR13622) for support.