<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Naman's Weblog on Analytics</title>
	<atom:link href="http://b000b000b000.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://b000b000b000.wordpress.com</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Thu, 03 Jul 2008 19:58:37 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='b000b000b000.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Naman's Weblog on Analytics</title>
		<link>http://b000b000b000.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://b000b000b000.wordpress.com/osd.xml" title="Naman&#039;s Weblog on Analytics" />
	<atom:link rel='hub' href='http://b000b000b000.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Planetary-Scale Views on a Large Instant-Messaging Network</title>
		<link>http://b000b000b000.wordpress.com/2008/07/03/66-is-the-new-6/</link>
		<comments>http://b000b000b000.wordpress.com/2008/07/03/66-is-the-new-6/#comments</comments>
		<pubDate>Thu, 03 Jul 2008 19:41:26 +0000</pubDate>
		<dc:creator>a000a000a000</dc:creator>
				<category><![CDATA[Six Degrees of Seperation]]></category>
		<category><![CDATA[2 degree seperation]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[bhilwada]]></category>
		<category><![CDATA[copy paste research]]></category>
		<category><![CDATA[gitesh tiwari]]></category>
		<category><![CDATA[god knows what technologies]]></category>
		<category><![CDATA[gurdaspur]]></category>
		<category><![CDATA[kota rajasthan]]></category>
		<category><![CDATA[malana]]></category>
		<category><![CDATA[naman chakraborty]]></category>
		<category><![CDATA[namanchakraborty]]></category>
		<category><![CDATA[online research]]></category>
		<category><![CDATA[online white paper]]></category>
		<category><![CDATA[prize wining reports]]></category>
		<category><![CDATA[roman catholic]]></category>
		<category><![CDATA[sharanpur]]></category>
		<category><![CDATA[six degree seperation]]></category>
		<category><![CDATA[uttarakhand]]></category>

		<guid isPermaLink="false">http://b000b000b000.wordpress.com/?p=6</guid>
		<description><![CDATA[In a research paper from June 2007, titled "Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network (PDF)," Eric Horvitz of Microsoft Research and Jure Leskovec of Carnegie Mellon University analyzed 30 billion conversations among 240 million people using Microsoft Instant Messenger in June 2006. It turned out that the average path length, or degree of separation, among the anonymized users probed was 6.6. (Credit: Wikipedia) Six degrees of separation posits that a person is a step away from people they know and two steps distant from people known by the people they know--thus the magic number six.


]]></description>
			<content:encoded><![CDATA[<p>In a research paper from June 2007, titled &#8220;Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network (PDF),&#8221; Eric Horvitz of Microsoft Research and Jure Leskovec of Carnegie Mellon University analyzed 30 billion conversations among 240 million people using Microsoft Instant Messenger in June 2006. It turned out that the average path length, or degree of separation, among the anonymized users probed was 6.6. (Credit: Wikipedia) Six degrees of separation posits that a person is a step away from people they know and two steps distant from people known by the people they know&#8211;thus the magic number six.</p>
<p style="text-align:center;">Planetary-Scale Views on a Large<br />
Instant-Messaging Network<br />
Jure Leskovec<br />
Carnegie Mellon University<br />
jure@cs.cmu.edu<br />
Eric Horvitz<br />
Microsoft Research<br />
horvitz@microsoft.com</p>
<p style="text-align:left;">ABSTRACT<br />
We present a study of anonymized data capturing a month<br />
of high-level communication activities within the whole of<br />
the Microsoft Messenger instant-messaging system. We examine<br />
characteristics and patterns that emerge from the collective<br />
dynamics of large numbers of people, rather than the<br />
actions and characteristics of individuals. The dataset contains<br />
summary properties of 30 billion conversations among<br />
240 million people. From the data, we construct a communication<br />
graph with 180 million nodes and 1.3 billion undirected<br />
edges, creating the largest social network constructed<br />
and analyzed to date. We report on multiple aspects of<br />
the dataset and synthesized graph. We find that the graph<br />
is well-connected and robust to node removal. We investigate<br />
on a planetary scale the oft-cited report that people<br />
are separated by “six degrees of separation” and find that<br />
the average path length among Messenger users is 6.6. We<br />
also find that people tend to communicate more with each<br />
other when they have similar age, language, and location,<br />
and that cross-gender conversations are both more frequent<br />
and of longer duration than conversations with the same<br />
gender.<br />
Categories and Subject Descriptors: H.2.8 [Database<br />
Management]: Database applications – Data mining<br />
General Terms: Measurement; Experimentation.<br />
Keywords: Social networks; Communication networks; User<br />
demographics; Large data; Online communication.</p>
<p style="text-align:left;">1. INTRODUCTION<br />
Large-scale web services provide unprecedented opportu-<br />
nities to capture and analyze behavioral data on a plan-<br />
etary scale. We discuss findings drawn from aggregations<br />
of anonymized data representing one month (June 2006) of<br />
high-level communication activities of people using the Mi-<br />
crosoft Messenger instant-messaging (IM) network. We did<br />
not have nor seek access to the content of messages. Rather,<br />
we consider structural properties of a communication graph<br />
and study how structure and communication relate to user<br />
demographic attributes, such as gender, age, and location.<br />
The data set provides a unique lens for studying patterns of<br />
human behavior on a wide scale.<br />
Jure Leskovec performed this research during an internship<br />
at Microsoft Research.<br />
Copyright is held by the International World Wide Web Conference Committee<br />
(IW3C2). Distribution of these papers is limited to classroom use,<br />
and personal use by others.<br />
WWW 2008, April 21–25, 2008, Beijing, China.<br />
ACM 978-1-60558-085-2/08/04.<br />
We explore a dataset of 30 billion conversations generated<br />
by 240 million distinct users over one month. We found that<br />
approximately 90 million distinct Messenger accounts were<br />
accessed each day and that these users produced about 1 bil-<br />
lion conversations, with approximately 7 billion exchanged<br />
messages per day. 180 million of the 240 million active accounts<br />
had at least one conversation during the observation period.<br />
We found that 99% of the conversations occurred between<br />
2 people, and the rest with greater numbers of participants.<br />
To our knowledge, our investigation represents the<br />
largest and most comprehensive study to date of presence<br />
and communications in an IM system. A recent report [6]<br />
estimated that approximately 12 billion instant messages are<br />
sent each day. Given the estimate and the growth of IM, we<br />
estimate that we captured approximately half of the world’s<br />
IM communication during the observation period.<br />
We created an undirected communication network from<br />
the data where each user is represented by a node and an<br />
edge is placed between users if they exchanged at least one<br />
message during the month of observation. The network rep-<br />
resents accounts that were active during June 2006. In sum-<br />
mary, the communication graph has 180 million nodes, rep-<br />
resenting users who participated in at least one conversation,<br />
and 1.3 billion undirected edges among active users, where<br />
an edge indicates that a pair of people communicated. We<br />
note that this graph should be distinguished from a buddy<br />
graph, where two people are connected if they appear on each<br />
other’s contact lists. The buddy graph for the data contains 240<br />
million nodes and 9.1 billion edges, which means an average<br />
account has approximately 50 buddies on a contact list.<br />
To highlight several of our key findings, we discovered that<br />
the communication network is well connected, with 99.9%<br />
of the nodes belonging to the largest connected component.<br />
We evaluated the oft-cited finding by Travers and Milgram<br />
that any two people are linked to one another on average<br />
via a chain with “6-degrees-of-separation” [17]. We found<br />
that the average shortest path length in the Messenger net-<br />
work is 6.6 (median 6), which is half a link more than the<br />
path length measured in the classic study. However, we<br />
also found that longer paths exist in the graph, with lengths<br />
up to 29. We observed that the network is well clustered,<br />
with a clustering coefficient [19] that decays with exponent<br />
−0.37. This decay is significantly lower than the value we<br />
had expected given prior research [11]. We found strong<br />
homophily [9, 12] among users; people have more conversa-<br />
tions and converse for longer durations with people who are<br />
similar to themselves. We find the strongest homophily for<br />
the language used, followed by conversants’ geographic lo-<br />
cations, and then age. We found that homophily does not<br />
hold for gender; people tend to converse more frequently<br />
and with longer durations with the opposite gender. We<br />
also examined the relation between communication and dis-<br />
tance, and found that the number of conversations tends to<br />
decrease with increasing geographical distance between con-<br />
versants. However, communication links spanning longer<br />
distances tend to carry more and longer conversations.<br />
2. INSTANT MESSAGING<br />
The use of IM has become widely adopted in personal<br />
and business communications. IM clients allow users fast,<br />
near-synchronous communication, placing IM between synchronous<br />
communication media, such as real-time voice<br />
interactions, and asynchronous communication media like<br />
email [18]. IM users exchange short text messages with one<br />
or more users from their list of contacts, who have to be on-<br />
line and logged into the IM system at the time of interaction.<br />
As conversations and messages exchanged within them are<br />
usually very short, it has been observed that users employ<br />
informal language, loose grammar, numerous abbreviations,<br />
and minimal punctuation [10]. Contact lists are commonly<br />
referred to as buddy lists and users on the lists are referred<br />
to as buddies.<br />
2.1 Research on Instant Messaging<br />
Several studies on smaller datasets are related to this<br />
work. Avrahami and Hudson [3] explored communication<br />
characteristics of 16 IM users. Similarly, Shi et al. [13] ana-<br />
lyzed IM contact lists submitted by users to a public website<br />
and explored a static contact network of 140,000 people. Re-<br />
cently, Xiao et al. [20] investigated IM traffic characteristics<br />
within a large organization with 400 users of Messenger. Our<br />
study differs from the latter study in that we analyze the full<br />
Messenger population over a one month period, capturing<br />
the interaction of user demographic attributes, communica-<br />
tion patterns, and network structure.<br />
2.2 Data description<br />
To construct the Microsoft Instant Messenger communica-<br />
tion dataset, we combined three different sources of data: (1)<br />
user demographic information, (2) time and user stamped<br />
events describing the presence of a particular user, and (3)<br />
communication session logs, where, for all participants, the<br />
number of exchanged messages and the periods of time spent<br />
participating in sessions are recorded.<br />
We use the terms session and conversation interchange-<br />
ably to refer to an IM interaction among two or more people.<br />
Although the Messenger system limits the number of people<br />
communicating at the same time to 20, people can enter<br />
and leave a conversation over time, so large sessions<br />
can be long, with many different people participating. We<br />
observed some very long sessions with more than 50 participants<br />
joining over time.<br />
All of our data was anonymized; we had no access to per-<br />
sonally identifiable information. Also, we had no access to<br />
text of the messages exchanged or any other information<br />
that could be used to uniquely identify users. We focused on<br />
analyzing high-level characteristics and patterns that emerge<br />
from the collective dynamics of 240 million people, rather<br />
than the actions and characteristics of individuals. The an-<br />
alyzed data can be split into three parts: presence data,<br />
communication data, and user demographic information:</p>
<p>• Presence events: These include login, logout, first<br />
ever login, add, remove and block a buddy, add un-<br />
registered buddy (invite new user), change of status<br />
(busy, away, be-right-back, idle, etc.). Events are user<br />
and time stamped.<br />
• Communication: For each user participating in the<br />
session, the log contains the following tuple: session id,<br />
user id, time joined the session, time left the session,<br />
number of messages sent, number of messages received.<br />
• User data: For each user, the following self-reported<br />
information is stored: age, gender, location (country,<br />
ZIP), language, and IP address. We use the IP address<br />
to decode the geographical coordinates, which we then<br />
use to position users on the globe and to calculate dis-<br />
tances.<br />
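The text does not say how distances are computed from the decoded coordinates.<br />
As one possibility, a great-circle (haversine) distance could be used. The sketch<br />
below assumes a spherical Earth with a mean radius of 6,371 km; the function name<br />
great_circle_km and the radius constant are illustrative and not taken from the paper.<br />

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (not stated in the paper)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine term: squared half-chord length between the two points.
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))
```

Under these assumptions, a quarter of the equator, great_circle_km(0, 0, 0, 90), comes out at roughly 10,000 km.<br />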
We gathered data for 30 days of June 2006. Each day<br />
yielded about 150 gigabytes of compressed text logs (4.5<br />
terabytes in total). Copying the data to a dedicated eight-<br />
processor server with 32 gigabytes of memory took 12 hours.<br />
Our log-parsing system employed a pipeline of four threads<br />
that parse the data in parallel, collapse the session join/leave<br />
events into sets of conversations, and save the data in a com-<br />
pact compressed binary format. This process compressed<br />
the data down to 45 GB per day. Processing the data took<br />
an additional 4 to 5 hours per day.<br />
A special challenge was to account for missing and dropped<br />
events, and session “id recycling” across different IM servers<br />
in a server farm. As part of this process, we closed a session<br />
48 hours after the last leave session event. We closed sessions<br />
automatically if only one user was left in the conversation.<br />
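The two closing rules above can be sketched as follows. This is one reading of the<br />
description, not the authors' pipeline: the event tuple layout, the Conversation class,<br />
and the choice to skip a leave event whose join was dropped are all assumptions made<br />
for illustration.<br />

```python
from dataclasses import dataclass, field
from typing import Optional

TIMEOUT = 48 * 3600  # seconds; the 48-hour rule from the text


@dataclass
class Conversation:
    session_id: str
    start: float
    end: float
    participants: set = field(default_factory=set)  # everyone who ever joined
    active: set = field(default_factory=set)        # users currently present
    last_leave: Optional[float] = None


def collapse_events(events):
    """Collapse time-sorted (time, session_id, user_id, kind) events into conversations.

    kind is "join" or "leave" (a hypothetical encoding of the session log).
    """
    open_by_id = {}
    closed = []
    for time, session_id, user_id, kind in events:
        conv = open_by_id.get(session_id)
        # Rule 1: 48 hours after the last leave event the session is closed,
        # so a later event with the same id is treated as a recycled id.
        if conv is not None and conv.last_leave is not None:
            if time - conv.last_leave >= TIMEOUT:
                closed.append(open_by_id.pop(session_id))
                conv = None
        if conv is None:
            if kind != "join":
                continue  # leave without a matching join (dropped event); skipped here
            conv = Conversation(session_id, start=time, end=time)
            open_by_id[session_id] = conv
        conv.end = time
        if kind == "join":
            conv.active.add(user_id)
            conv.participants.add(user_id)
        else:
            conv.active.discard(user_id)
            conv.last_leave = time
            # Rule 2: close automatically once only one user (or nobody) is left.
            if len(conv.active) in (0, 1):
                closed.append(open_by_id.pop(session_id))
    closed.extend(open_by_id.values())  # sessions still open when the log ends
    return closed
```

Keying open sessions by id and checking the timeout before each event lets a recycled id start a fresh conversation instead of extending an old one.<br />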
3. USAGE &amp; POPULATION STATISTICS<br />
We shall first review several statistics drawn from aggre-<br />
gations of users and their communication activities.<br />
3.1 Levels of activity<br />
Over the observation period, 242,720,596 users logged into<br />
Messenger and 179,792,538 of these users were actively en-<br />
gaged in conversations by sending or receiving at least one<br />
IM message. Over the month of observation, 17,510,905 new<br />
accounts were activated. As a representative day, on June<br />
1, 2006, there were almost 1 billion (982,005,323) different<br />
sessions (conversations among any number of people), with<br />
more than 7 billion IM messages sent. Approximately 93<br />
million users logged in with 64 million different users becom-<br />
ing engaged in conversations on that day. Approximately 1.5<br />
million new users that were not registered within Microsoft<br />
Messenger were invited to join on that particular day.</p>
<p style="text-align:left;">
strongly overrepresented in the active Messenger population.<br />
Focusing on the differences by gender, females are overrepresented<br />
for the 10–14 age interval. For male users, we see<br />
overall matches with the world population for age spans 10–14<br />
and 35–39; for women users, we see a match for ages in<br />
the span of 30–34. We note that 6.5% of the population did<br />
not submit an age when creating their Messenger accounts.<br />
4. COMMUNICATION CHARACTERISTICS<br />
We now focus on characteristics and patterns with com-<br />
munications. We limit the analysis to conversations between<br />
two participants, which account for 99% of all conversations.<br />
We first examine the distributions over conversation durations<br />
and times between conversations. Let user u have<br />
C conversations in the observation period. Then, for every<br />
conversation i of user u we create a tuple (ts<sub>u,i</sub>, te<sub>u,i</sub>, m<sub>u,i</sub>),<br />
where ts<sub>u,i</sub> denotes the start time of the conversation, te<sub>u,i</sub><br />
is the end time of the conversation, and m<sub>u,i</sub> is the number<br />
of exchanged messages between the two users. We order the<br />
conversations by their start time (ts<sub>u,i</sub> &lt; ts<sub>u,i+1</sub>). Then,<br />
for every user u, we calculate the average conversation duration<br />
d̄(u) = (1/C) Σ<sub>i</sub> (te<sub>u,i</sub> − ts<sub>u,i</sub>), where the sum goes over<br />
all of u’s conversations. Figure 5(a) shows the distribution<br />
of d̄(u) over all the users u. We find that the conversation<br />
length can be described by a heavy-tailed distribution with<br />
exponent −3.7 and a mode of 4 minutes.<br />
Figure 5(b) shows the intervals between consecutive conversations<br />
of a user. We plot the distribution of ts<sub>u,i+1</sub> −<br />
ts<sub>u,i</sub>, where ts<sub>u,i+1</sub> and ts<sub>u,i</sub> denote start times of two consecutive<br />
conversations of user u. The power-law exponent of<br />
the distribution over intervals is −1.5. This result is similar<br />
to the temporal distribution for other kinds of human<br />
communication activities, e.g., waiting times of emails and<br />
letters before a reply is generated [4]. The exponent can be<br />
explained by a priority-queue model where tasks of different</p>
<p>This model generates a task waiting time<br />
distribution described by a power-law with exponent −1.5.<br />
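The two per-user quantities defined in this section, the average conversation duration<br />
and the intervals between consecutive conversation start times, can be computed as in<br />
the minimal sketch below. It assumes each conversation is a (start, end, messages) tuple<br />
with times in seconds; the function names are illustrative.<br />

```python
def mean_duration(conversations):
    """Average of (end - start) over one user's (start, end, messages) tuples."""
    total = sum(end - start for start, end, _messages in conversations)
    return total / len(conversations)


def start_intervals(conversations):
    """Gaps between start times of consecutive conversations, ordered by start time."""
    starts = sorted(start for start, _end, _messages in conversations)
    return [later - earlier for earlier, later in zip(starts, starts[1:])]
```

The duration and interval distributions discussed above would then be histograms of these values collected over all users.<br />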
5. COMMUNICATION DEMOGRAPHICS<br />
Next we examine the interplay of communication and user<br />
demographic attributes, i.e., how geography, location, age,<br />
and gender influence observed communication patterns.<br />
5.1 Communication by age<br />
We sought to understand how communication among peo-<br />
ple changes with the reported ages of participating users.<br />
Figures 6(a)-(d) use a heat-map visualization to commu-<br />
nicate properties for different age–age pairs. The rows and<br />
columns represent the ages of both parties participating, and<br />
the color at each age–age cell captures the logarithm of the<br />
value for the pairing. The color spectrum extends from blue<br />
(low value) through green, yellow, and onto red (the highest<br />
value). Because of potential misreporting at very low and<br />
high ages, we concentrate on users with self-reported ages<br />
that fall between 10 and 60 years.<br />
Let a tuple (a<sub>i</sub>, b<sub>i</sub>, d<sub>i</sub>, m<sub>i</sub>) denote the ith conversation in<br />
the entire dataset that occurred among users of ages a<sub>i</sub><br />
and b<sub>i</sub>. The conversation had a duration of d<sub>i</sub> seconds<br />
during which m<sub>i</sub> messages were exchanged. Let C<sub>a,b</sub> =<br />
{(a<sub>i</sub>, b<sub>i</sub>, d<sub>i</sub>, m<sub>i</sub>) : a<sub>i</sub> = a ∧ b<sub>i</sub> = b} denote the set of all conversations<br />
between users of ages a and b, respectively.<br />
Figure 6(a) shows the number of conversations among people<br />
of different ages. For every pair of ages (a, b) the color<br />
indicates the size of set C<sub>a,b</sub>, i.e., the number of different<br />
conversations between users of ages a and b. We note that,<br />
as the notion of a conversation is symmetric, the plots are<br />
symmetric. Most conversations occur between people of ages<br />
10 to 20. The diagonal trend indicates that people tend to<br />
talk to people of similar age. This is true especially for age<br />
groups between 10 and 30 years. We shall explore this ob-<br />
servation in more detail in Section 6.</p>
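<p>The age–age counts behind a plot like Figure 6(a) can be sketched as follows. This is a minimal illustration that assumes conversations are given as (age_a, age_b, duration, messages) tuples with integer ages; the function names and the use of a base-10 logarithm for the color value are assumptions, since the text does not specify the base.</p>

```python
import math
from collections import Counter


def age_pair_counts(conversations, min_age=10, max_age=60):
    """Count conversations per (a, b) age pair, keeping only ages in [min_age, max_age]."""
    ages = range(min_age, max_age + 1)
    counts = Counter()
    for a, b, _duration, _messages in conversations:
        if a not in ages or b not in ages:
            continue  # drop ages outside the range the text concentrates on
        counts[(a, b)] += 1
        if a != b:
            counts[(b, a)] += 1  # a conversation is symmetric, so fill both cells
    return counts


def log_values(counts):
    """Logarithm of each cell, as used for the heat-map color (base 10 assumed)."""
    return {pair: math.log10(n) for pair, n in counts.items()}
```

<p>Filling both (a, b) and (b, a) is what makes the resulting matrix symmetric, as the text notes for the plots.</p>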
<p>Here, we see that younger people have faster-paced dialogs,<br />
while older people exchange messages at a slower pace.<br />
We note that the younger population (ages 10–35) is<br />
strongly biased towards communicating with people of a<br />
similar age (diagonal trend in Figure 6(a)), and that users<br />
who report being of ages 35 years and above tend to com-<br />
municate more evenly across ages (rectangular pattern in<br />
Fig. 6(a)). Moreover, older people have conversations of the<br />
longest durations, with a “valley” in the duration of conver-<br />
sations for users of ages 25–35. Such a dip may represent<br />
shorter, faster-paced and more intensive conversations asso-<br />
ciated with work-related communications, versus more ex-<br />
tended, slower, and longer interactions associated with social<br />
discourse.<br />
5.2 Communication by gender<br />
We report on analyses of properties of pairwise communi-<br />
cations as a function of the self-reported gender of users in<br />
conversations in Table 1. Let C<sub>g,h</sub> = {(g<sub>i</sub>, h<sub>i</sub>, d<sub>i</sub>, m<sub>i</sub>) : g<sub>i</sub> =<br />
g ∧ h<sub>i</sub> = h} denote the set of conversations where the two participating<br />
users are of genders g and h. Note that g takes 3<br />
possible values: female, male, and unknown (unreported).<br />
Table 1(a) relays |C<sub>g,h</sub>| for combinations of genders g and<br />
h. The table shows that approximately 50% of conversations<br />
occur between male and female and 40% of the conversations<br />
occur among users of the same gender (20% for each). A<br />
small number of conversations occur between people who<br />
did not reveal their gender.</p>
<p style="text-align:left;">The<br />
number of messages exchanged per minute of conversation<br />
for male–female conversations is higher, at 1.5 messages per<br />
minute, than for same-gender conversations, where the rate<br />
is 1.43 messages per minute.<br />
We examined the number of communication ties, where a<br />
tie is established between two people when they exchange<br />
at least one message during the observation period. We<br />
computed 300 million male–male ties, 255 million female–<br />
female ties, and 640 million cross-gender ties. The Mes-<br />
senger population consists of 100 million males and 80 mil-<br />
lion females by self report. These findings demonstrate that<br />
ties are not heavily gender biased; based on the popula-<br />
tion, random chance predicts 31% male–male, 20% female–<br />
female, and 49% female–male links. We observe 25% male–<br />
male, 21% female–female, and 54% cross-gender links, thus<br />
demonstrating a minor bias of female–male links.<br />
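The random-chance and observed percentages above can be checked with a short calculation.<br />
The sketch assumes that "random chance" means picking both endpoints of a tie independently<br />
from the reported population (100 million males, 80 million females); the function names<br />
are illustrative.<br />

```python
def expected_tie_shares(males, females):
    """Shares of tie types if both endpoints are drawn independently at random."""
    p_male = males / (males + females)
    p_female = 1 - p_male
    return {
        "male-male": p_male ** 2,
        "female-female": p_female ** 2,
        "cross-gender": 2 * p_male * p_female,
    }


def observed_tie_shares(ties):
    """Normalize raw tie counts to shares of the total."""
    total = sum(ties.values())
    return {kind: n / total for kind, n in ties.items()}


expected = expected_tie_shares(males=100e6, females=80e6)
observed = observed_tie_shares(
    {"male-male": 300e6, "female-female": 255e6, "cross-gender": 640e6}
)
for kind in expected:
    print(f"{kind}: expected {expected[kind]:.0%}, observed {observed[kind]:.0%}")
```

Rounded to whole percentages, both sets of shares agree with the figures quoted in the text.<br />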
The results reported in Table 1 run counter to prior stud-<br />
ies reporting that communication among individuals who<br />
resemble one another (same gender) occurs more often (see [9]<br />
and references therein). We identified significant heterophily,<br />
where people tend to communicate more with people of the<br />
opposite gender. However, we note that link heterogeneity<br />
was very close to the population value [8], i.e., the number of<br />
same- and cross-gender ties roughly corresponds to random<br />
chance. This shows there is no significant bias in linking<br />
for gender. However, we observe that cross-gender conver-<br />
sations tend to be longer and to include more messages,<br />
suggesting that more effort is devoted to conversations with<br />
the opposite sex.<br />
5.3 World geography and communication<br />
We now focus on the influence of geography and distance<br />
among participants on communications. Figure 7 shows the<br />
geographical locations of Messenger users. The general lo-<br />
cation of the user was obtained via reverse IP lookup. We<br />
plot all latitude/longitude positions linked to the position of<br />
servers where users log into the service. The color of each<br />
dot corresponds to the logarithm of the number of logins<br />
from the respective location, again using a spectrum of col-<br />
ors ranging from blue (low) through green and yellow to red<br />
(high). Although the maps are built solely by plotting these<br />
positions, a recognizable world map is generated. We find<br />
that North America, Europe, and Japan are very dense, with<br />
many users from those regions using Messenger. For the rest<br />
of the world, the population of Messenger users appears to<br />
reside largely in coastal regions.<br />
We can condition the densities and behaviors of Messen-<br />
ger users on multiple geographical and socioeconomic vari-<br />
ables and explore relationships between electronic commu-<br />
nications and other attributes.</p>
<p>Communication among people within different countries<br />
also varies depending on the locations of conversants. We<br />
examine two such views. Figure 10(a) shows the top coun-<br />
tries by the number of conversations between pairs of coun-<br />
tries. We examined all pairs of countries with more than<br />
10 million conversations per month. The width of edges in<br />
the figure is proportional to the logarithm of the number of<br />
conversations among the countries. We find that the United<br />
States and Spain appear to serve as hubs and that edges<br />
appear largely between historically or ethnically connected<br />
countries. As examples, Spain is connected with the Span-<br />
ish speaking countries in South America, Germany links to<br />
Turkey, Portugal to Brazil, and China to Korea.<br />
Figure 10(b) displays a similar plot where we consider<br />
country pairs by the average duration of conversations. The<br />
widths of the edges are proportional to the mean length of<br />
conversations between the countries. The core of the network<br />
appears to be Arabic countries, including Saudi Arabia,<br />
Egypt, United Arab Emirates, Jordan, and Syria.</p>
<p>[Figure 10 caption: Countries by average length of the conversation.<br />
Edge widths correspond to logarithms of intensity of links.]</p>
<p>
5.5 Communication and geographical distance<br />
We were interested in how communications change as the<br />
distance between people increases. We had hypothesized<br />
that the number of conversations would decrease with geo-<br />
graphical distance as users might be doing less coordination<br />
with one another on a daily basis, and where communication<br />
would likely require more effort to coordinate than might<br />
typically be needed for people situated more locally. We<br />
also conjectured that, once initiated, conversations among<br />
people who are farther apart would be somewhat longer as<br />
there might be a stronger need to catch up when the less-<br />
frequent conversations occurred.<br />
Figure 11 plots the relation between communication and<br />
distance. Figure 11(a) shows the distribution of the num-<br />
ber of conversations between conversants at distance l. We<br />
found that the number of conversations decreases with dis-<br />
tance. However, we observe a peak at a distance of approx-<br />
imately 500 kilometers. The other peaks and drops may re-<br />
veal geographical features. For example, a significant drop<br />
in communication at distance of 5,000 km (3,500 miles) may<br />
reflect the width of the Atlantic ocean or the distance be-<br />
tween the east and west coasts of the United States. The<br />
number of links rapidly decreases with distance. This finding<br />
suggests that users may use Messenger mainly for communi-<br />
cations with others within a local context and environment.<br />
We found that the number of exchanged messages and con-</p>
<p>Conversation duration decreases with the distance,<br />
while the number of exchanged messages remains constant<br />
before decreasing slowly. Figure 11(b) shows the commu-<br />
nications per link versus the distance among participants.<br />
The plot shows that longer links, i.e., connections between<br />
people who are farther apart, are more frequently used than<br />
shorter links. We interpret this finding to mean that peo-<br />
ple who are farther apart use Messenger more frequently to<br />
communicate.<br />
In summary, we observe that the total number of links and<br />
associated conversations decreases with increasing distance<br />
among participants. The same is true for the duration of<br />
conversations, the number of exchanged messages per con-<br />
versation, and the number of exchanged messages per unit<br />
time. However, the number of times a link is used tends<br />
to increase with the distance among users. This suggests<br />
that people who are farther apart tend to converse with IM<br />
more frequently, which perhaps takes the place of more ex-<br />
pensive long-distance voice telephony; voice might be used<br />
more frequently in lieu of IM for less expensive local com-<br />
munications.<br />
6. HOMOPHILY OF COMMUNICATION<br />
We performed several experiments to measure the level<br />
at which people tend to communicate with similar people.<br />
First, we consider all 1.3 billion pairs of people who ex-<br />
changed at least one message in June 2006, and calculate<br />
the similarity of various user demographic attributes. We<br />
contrast this with the similarity of pairs of users selected<br />
via uniform random sampling across 180 million users. We<br />
consider two measures of similarity: the correlation coeffi-<br />
cient and the probability that users have the same attribute<br />
value, e.g., that users come from the same countries.<br />
Table 2 compares correlation coefficients of various user<br />
attributes when pairs of users are chosen uniformly at ran-<br />
dom with coefficients for pairs of users who communicate.<br />
We can see that attributes are not correlated for random<br />
pairs of people, but that they are highly correlated for users</p>
<p>As we noted earlier, gender and commu-<br />
nication are slightly negatively correlated; people tend to<br />
communicate more with people of the opposite gender.<br />
Another method for identifying association is to measure<br />
the probability that a pair of users will show an exact match<br />
in values of an attribute, i.e., identifying whether two users<br />
come from the same country, speak the same language, etc.<br />
Table 2 shows the results for the probability of users sharing<br />
the same attribute value. We make similar observations as<br />
before. People who communicate are more likely to share<br />
common characteristics, including age, location, language,<br />
and they are less likely to be of the same gender. We note<br />
that the most common attribute of people who communi-<br />
cate is language. On the flip side, the amount of communi-<br />
cation tends to decrease with increasing user dissimilarity.<br />
This relationship is highlighted in Figure 11, which shows<br />
how communication among pairs of people decreases with<br />
distance.<br />
Figure 12 further illustrates the results displayed in Ta-<br />
ble 2, where we randomly sample pairs of users from the<br />
Messenger user base, and then plot the distribution over<br />
reported ages. As most of the population comes from the<br />
age group 10–30, the distribution of random pairs of people<br />
reaches the mode at those ages but there is no correlation.<br />
Figure 12(b) shows the distribution of ages over the pairs<br />
of people who communicate. Note the correlation, as repre-<br />
sented by the diagonal trend on the plot, where people tend<br />
to communicate more with others of a similar age.<br />
Next, we further explore communication patterns by the<br />
differences in the reported ages among users. Figure 13(a)<br />
plots on a log-linear scale the number of conversations in the<br />
social network with participants of varying age differences.<br />
Again we see that links and conversations are strongly cor-<br />
related with the age differences among participants. Fig-<br />
ure 13(b) shows the average conversation duration with the<br />
age difference among the users. Interestingly, the mean con-<br />
versation duration peaks at an age difference of 20 years<br />
between participants. We speculate that the peak may cor-<br />
respond roughly to the gap between generations.<br />
The plots reveal that there is strong homophily in the com-<br />
munication network for age; people tend to communicate<br />
more with people of similar reported age. This is especially<br />
salient for the number of buddies and conversations among<br />
people of the same ages. We also observe that the links<br />
between people of similar attributes are used more often,<br />
to interact with shorter and more intense (more exchanged<br />
messages) communications. The intensity of communica-<br />
tion decays linearly with the difference in age. In contrast<br />
to findings of previous studies, we observe that the number<br />
of cross-gender communication links follows random<br />
chance. However, cross-gender communication takes longer<br />
and is faster paced as it seems that people tend to pay more<br />
attention when communicating with the opposite sex.<br />
Recently, using the data we generated, Singla and Richardson<br />
further investigated the homophily within the Messenger<br />
network and found that people who communicate are also<br />
more likely to search the web for content on similar top-<br />
ics [14].<br />
7. THE COMMUNICATION NETWORK<br />
So far we have examined communication patterns based<br />
on pairwise communications. We now create a more general<br />
communication network from the data. Using this network,<br />
we can examine the typical social distance between people,<br />
i.e., the number of links that separate a random pair of<br />
people. This analysis seeks to understand how many peo-<br />
ple can be reached within certain numbers of hops among<br />
people who communicate. Also, we test the transitivity of<br />
the network, i.e., the degree to which pairs with a common<br />
friend tend to be connected.<br />
We constructed a graph from the set of all two-user con-<br />
versations, where each node corresponds to a person and<br />
there is an undirected edge between a pair of nodes if the<br />
users were engaged in an active conversation during the ob-<br />
servation period (users exchanged at least 1 message). The<br />
resulting network contains 179,792,538 nodes, and 1,342,246,427<br />
edges. Note that this is not simply a buddy network; we<br />
only connect people who are buddies and have communi-<br />
cated during the observation period.<br />
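The graph construction can be sketched as follows. The record layout (user_a, user_b, messages_exchanged) is a hypothetical stand-in for the conversation logs, and adjacency sets in memory are an assumption that would not scale to the full dataset.<br />

```python
from collections import defaultdict


def build_communication_graph(conversations):
    """Build an undirected graph as a dict of adjacency sets.

    `conversations` is an iterable of (user_a, user_b, messages_exchanged)
    tuples; this layout is assumed for illustration only.
    """
    adjacency = defaultdict(set)
    for user_a, user_b, messages in conversations:
        # An edge requires an active conversation: at least one message.
        if messages >= 1 and user_a != user_b:
            adjacency[user_a].add(user_b)
            adjacency[user_b].add(user_a)
    return dict(adjacency)


def count_edges(adjacency):
    """Each undirected edge appears in two adjacency sets."""
    return sum(len(neighbours) for neighbours in adjacency.values()) // 2


log = [
    ("u1", "u2", 5),
    ("u2", "u1", 3),   # same pair again: still a single edge
    ("u2", "u3", 1),
    ("u3", "u4", 0),   # no message exchanged: no edge
]
graph = build_communication_graph(log)
print(len(graph), count_edges(graph))
```

Repeated conversations between the same two users collapse into one edge, which matches the definition of an edge as "communicated at least once".<br />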
Figures 14–15 show the structural properties of the com-<br />
munication network. The network degree distribution shown<br />
in Figure 14(a) is heavy tailed but does not follow a power-<br />
law distribution. Using maximum likelihood estimation, we<br />
fit a power-law with exponential cutoff p(k) ∝ k<sup>−a</sup>e<sup>−bk</sup> with<br />
fitted parameter values a = 0.8 and b = 0.03. We found a<br />
strong cutoff parameter and low power-law exponent, sug-<br />
gesting a distribution with high variance.<br />
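To make the functional form concrete, the sketch below evaluates a power law with exponential cutoff as a normalized distribution over degrees 1..k_max and computes the log-likelihood that a maximum likelihood fit would maximize. Truncating at k_max and the function names are assumptions of this example; the paper does not describe its fitting procedure in this detail.<br />

```python
import math


def cutoff_power_law_pmf(a, b, k_max):
    """Normalized p(k) proportional to k^(-a) * exp(-b*k) for k = 1..k_max.

    Index i of the returned list holds p(k = i + 1).
    """
    weights = [k ** (-a) * math.exp(-b * k) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def log_likelihood(degrees, a, b, k_max):
    """Log-likelihood of observed degrees under parameters (a, b)."""
    pmf = cutoff_power_law_pmf(a, b, k_max)
    return sum(math.log(pmf[k - 1]) for k in degrees)


# Parameter values reported for the communication network.
pmf = cutoff_power_law_pmf(a=0.8, b=0.03, k_max=600)
print(pmf[0], pmf[99])
```

A fit would search over (a, b) for the pair with the highest log-likelihood on the observed degree sequence.<br />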
Figure 14(b) displays the degree distribution of a buddy<br />
graph. We did not have access to the full buddy network;<br />
we only had access to data on the length of the user contact<br />
list which allowed us to create the plot. We found a total<br />
of 9.1 billion buddy edges in the graph with 49 buddies per<br />
user. We fit the data with a power-law distribution with<br />
exponential cutoff and identified parameters of a = 0.6 and<br />
b = 0.01. The power-law exponent now is even smaller.<br />
This model described the data well. We note a spike at<br />
600 which is the limit on the maximal number of buddies<br />
imposed by the Messenger software client. The maximal<br />
number of buddies was increased to 300 from 150 in March<br />
2005, and was later raised to 600. With the data from June<br />
2006, we see only the peak at 600, and could not identify<br />
bumps at the earlier constraints.<br />
Social networks have been found to be highly transitive,<br />
i.e., people with common friends tend to be friends them-<br />
selves. The clustering coefficient [19] has been used as a<br />
measure of transitivity in the network. The measure is de-<br />
fined as the fraction of triangles around a node of degree<br />
k [19]. Figure 15(a) displays the clustering coefficient ver-<br />
sus the degree of a node for Messenger. Previous results<br />
on measuring the web graph as well as theoretical analyses<br />
show that the clustering coefficient decays as k<sup>−1</sup> (exponent<br />
−1) with node degree k [11]. For the Messenger network,<br />
the clustering coefficient decays very slowly with exponent<br />
−0.37 with the degree of a node and the average clustering<br />
coefficient is 0.137. This result suggests that clustering in<br />
the Messenger network is much higher than expected—that<br />
people with common friends also tend to be connected. Fig-<br />
ure 15(b) displays the distribution of the connected compo-<br />
nents in the network. The giant component contains 99.9%<br />
of the nodes in the network against a background of small<br />
components, and the distribution follows a power law.<br />
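As an illustration of the clustering measure, the sketch below computes the local clustering coefficient of a node (the fraction of pairs of its neighbours that are themselves connected) and averages it over nodes of equal degree, which is the quantity plotted against k. The toy graph and function names are hypothetical.<br />

```python
from collections import defaultdict
from itertools import combinations


def local_clustering(adjacency, node):
    """Fraction of neighbour pairs of `node` that are linked to each other."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


def clustering_by_degree(adjacency):
    """Average local clustering coefficient for each node degree k."""
    buckets = defaultdict(list)
    for node, neighbours in adjacency.items():
        buckets[len(neighbours)].append(local_clustering(adjacency, node))
    return {k: sum(values) / len(values) for k, values in buckets.items()}


# Toy graph: a triangle (1, 2, 3) with one extra node hanging off node 3.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
print(clustering_by_degree(toy))
```

Plotting the returned averages against k on log-log axes gives the kind of curve whose slope is compared with the exponents discussed above.<br />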
7.1 How small is the small-world?<br />
Messenger data gives us a unique opportunity to study<br />
distances in the social network. To our knowledge, this is the<br />
first time a planetary-scale social network has been available<br />
to validate the well-known “6 degrees of separation” finding<br />
by Travers and Milgram [17]. The earlier work employed<br />
a sample of 64 people and found that the average number<br />
of hops for a letter to travel from Nebraska to Boston was<br />
6.2 (mode 5, median 5), which is popularly known as the “6<br />
degrees of separation” among people. We used a population<br />
sample that is more than two million times larger than the<br />
group studied earlier and confirmed the classic finding.</p>
<p>To approximate the distribution of the dis-<br />
tances, we randomly sampled 1000 nodes and calculated for<br />
each node the shortest paths to all other nodes. We found<br />
that the distribution of path lengths reaches the mode at<br />
6 hops and has a median at 7. The average path length is<br />
6.6. This result means that a random pair of nodes in the<br />
Messenger network is 6.6 hops apart on the average, which is<br />
half a link longer than the length measured by Travers and<br />
Milgram. The 90th percentile (effective diameter [16]) of the<br />
distribution is 7.8. 48% of nodes can be reached within 6<br />
hops and 78% within 7 hops. So, we might say that, via the<br />
lens provided on the world by Messenger, we find that there<br />
are about “7 degrees of separation” among people. We note<br />
that long paths, i.e., nodes that are far apart, exist in the<br />
network; we found paths up to a length of 29.<br />
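The sampling procedure can be sketched with breadth-first search from a set of randomly chosen source nodes. This is an illustrative example on an in-memory adjacency dict, with hypothetical function names; it is not the implementation used on the full network.<br />

```python
import random
from collections import Counter, deque


def bfs_distances(adjacency, source):
    """Hop counts from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def sampled_path_lengths(adjacency, sample_size, seed=0):
    """Histogram of shortest-path lengths from randomly sampled sources."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    sources = rng.sample(nodes, min(sample_size, len(nodes)))
    histogram = Counter()
    for source in sources:
        for target, hops in bfs_distances(adjacency, source).items():
            if target != source:
                histogram[hops] += 1
    return histogram


def mean_path_length(histogram):
    """Average hop count over all sampled (source, target) pairs."""
    total = sum(histogram.values())
    return sum(hops * count for hops, count in histogram.items()) / total


# Toy path graph 1 - 2 - 3 - 4; all four nodes are used as sources.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
hist = sampled_path_lengths(path, sample_size=4)
print(dict(hist), mean_path_length(hist))
```

The mode, median, and percentiles of the path-length distribution can be read off the same histogram.<br />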
7.2 Network cores<br />
We further study connectivity of the communication net-<br />
work by examining the k-cores [5] of the graph. The concept<br />
of k-core is a generalization of the giant connected compo-<br />
nent. The k-core of a network is a set of vertices K, where<br />
each vertex in K has at least k edges to other vertices in<br />
K. The distribution of k-core sizes gives us an idea of how<br />
quickly the network shrinks as we move towards the core.<br />
The k-core of a graph can be obtained by deleting from<br />
the network all vertices of degree less than k. This process<br />
will decrease degrees of some non-deleted vertices, so more<br />
vertices will have degree less than k. We keep pruning ver-<br />
tices until all remaining vertices have degree of at least k.<br />
We call the remaining vertices a k-core.<br />
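The pruning procedure just described can be sketched as follows. The function name and the adjacency-dict representation are assumptions of this example.<br />

```python
def k_core(adjacency, k):
    """Return the set of vertices in the k-core of an undirected graph.

    Vertices of degree less than k are deleted repeatedly, updating the
    degrees of their remaining neighbours, until none are left to delete.
    """
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    removed = set()
    stack = [node for node, d in degree.items() if d < k]
    while stack:
        node = stack.pop()
        if node in removed:
            continue
        removed.add(node)
        for neighbour in adjacency[node]:
            if neighbour not in removed:
                degree[neighbour] -= 1
                if degree[neighbour] < k:
                    stack.append(neighbour)
    return set(adjacency) - removed


# Toy graph: a triangle (1, 2, 3) with one extra node hanging off node 3.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
for k in (1, 2, 3):
    print(k, sorted(k_core(toy, k)))
```

Calling the function for increasing k and recording the size of the result yields the core-size curve of the kind discussed next.<br />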
Figure 16 plots the number of nodes in a core of order<br />
k. We note that the core sizes are remarkably stable up to<br />
a value of k ≈ 20; the number of nodes in the core drops<br />
by only an order of magnitude. For k &gt; 20, the core<br />
size rapidly drops. The central part of the communication<br />
network is composed of 79 nodes, where each of them has<br />
more than 68 edges inside the set. The structure of the<br />
Messenger communication network is quite different from<br />
the Internet graph; it has been observed [2] that the size of<br />
a k-core of the Internet decays as a power-law with k. Here<br />
we see that the core sizes remain very stable up to a degree<br />
≈ 20, and only then start to rapidly decrease. This means<br />
that the nodes with degrees of less than 20 are on the fringe<br />
of the network, and that the core starts to rapidly decrease<br />
as nodes of degree 20 or more are deleted.<br />
7.3 Strength of the ties<br />
It has been observed by Albert et al. [1] that many real-<br />
world networks are robust to node-level changes or attacks.<br />
Researchers have shown that networks like the World Wide<br />
Web, Internet, and several social networks display a high<br />
degree of robustness to random node removals, i.e., one has<br />
to remove many nodes chosen uniformly at random to make<br />
the network disconnected. In contrast, targeted attacks<br />
are very effective. Removing a few high degree nodes can<br />
have a dramatic influence on the connectivity of a network.<br />
Let us now study how the Messenger communication net-<br />
work is decomposed when “strong,” i.e., heavily used, edges<br />
are removed from the network. We consider several different<br />
definitions of “heavily used,” and measure the types of edges<br />
that are most important for network connectivity. We note<br />
that a similar experiment was performed by Shi et al. [13]<br />
in the context of a small IM buddy network. The authors<br />
of the prior study took the number of common friends at<br />
the ends of an edge as a measure of the link strength. As<br />
the number of edges here is too large (1.3 billion) to remove<br />
edges one by one, we employed the following procedure: We<br />
order the nodes by decreasing value of a measure of the<br />
intensity of engagement of users; we then delete nodes as-<br />
sociated with users in order of decreasing measure and we<br />
observe the evolution of the properties of the communication<br />
network as nodes are deleted.<br />
We consider the following different measures of engage-<br />
ment:<br />
• Average sent: The average number of sent messages<br />
per user’s conversation<br />
• Average time: The average duration of user’s conver-<br />
sations<br />
• Links: The number of links of a user (node degree),<br />
i.e., number of different people he or she exchanged<br />
messages with<br />
• Conversations: The total number of conversations of a<br />
user in the observation period<br />
• Sent messages: The total number of sent messages by<br />
a user in the observation period<br />
• Sent per unit time: The number of sent messages per<br />
unit time of a conversation<br />
• Total time: The total conversation time of a user in<br />
the observation period<br />
At each step of the experiment, we remove 10 million<br />
nodes in order of the specific measure of engagement being<br />
studied. We then determine the relative size of the largest<br />
connected component, i.e., given the network at a particular<br />
step, we find the fraction of the nodes belonging to the<br />
largest connected component of the network.<br />
Figure 17 plots the evolution of the fraction of nodes in<br />
the largest connected component with the number of deleted<br />
nodes. We plot a separate curve for each of the seven dif-<br />
ferent measures of engagement. For comparison, we also<br />
consider the random deletion of the nodes.<br />
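A small-scale sketch of the removal experiment follows. Nodes are deleted in batches in decreasing order of a per-node engagement score, and after each batch the relative size of the largest connected component is recorded. Measuring that size as a fraction of the nodes still present is an assumption of this example, as are the function names and the toy star graph.<br />

```python
from collections import deque


def largest_component_size(adjacency, alive):
    """Size of the largest connected component among the `alive` nodes."""
    seen = set()
    best = 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour in alive and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def removal_curve(adjacency, score, batch_size):
    """List of (nodes deleted, fraction of remaining nodes in largest component)."""
    order = sorted(adjacency, key=lambda n: score[n], reverse=True)
    alive = set(adjacency)
    curve = []
    for i in range(0, len(order), batch_size):
        alive.difference_update(order[i:i + batch_size])
        if not alive:
            break
        fraction = largest_component_size(adjacency, alive) / len(alive)
        curve.append((len(adjacency) - len(alive), fraction))
    return curve


# Toy star graph: node 0 is linked to four leaves. Score = number of links.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
links = {node: len(neighbours) for node, neighbours in star.items()}
print(removal_curve(star, links, batch_size=1))
```

Deleting the hub first shatters the star into isolated leaves, a miniature version of the rapid decay seen when nodes are removed in order of their number of links.<br />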
The decomposition procedure highlighted two types of dy-<br />
namics of network change with node removal. The size of the<br />
largest component decreases rapidly when we use as mea-<br />
sures of engagement the number of links, number of conver-<br />
sations, total conversation time, or number of sent messages.<br />
In contrast, the size of the largest component decreases very<br />
slowly when we use as a measure of engagement the average</p>
<p>time per conversation, average number of sent messages, or<br />
number of sent messages per unit time. We were not surprised<br />
to find that the size of the largest component decreases<br />
most rapidly when nodes are deleted in order of the<br />
decreasing number of links that they have, i.e., the number<br />
of people with whom a user at a node communicates. Ran-<br />
dom ordering of the nodes shrinks the component at the<br />
slowest rate. After removing 160 million out of 180 million<br />
nodes with the random policy, the largest component still<br />
contains about half of the nodes. Surprisingly, when deleting<br />
up to 100 million nodes, the average time per conversation<br />
measure shrinks the component even more slowly than the<br />
random deletion policy.</p>
<p>We have reviewed a set of results stemming from the gen-<br />
eration and analysis of an anonymized dataset representing<br />
the communication patterns of all people using a popular<br />
IM system. The methods and findings highlight the value of<br />
using a large IM network as a worldwide lens onto aggregate<br />
human behavior.<br />
We described the creation of the dataset, capturing high-<br />
level communication activities and demographics in June<br />
2006. The core dataset contains more than 30 billion conver-<br />
sations among 240 million people. We discussed the creation<br />
and analysis of a communication graph from the data containing<br />
180 million nodes and 1.3 billion edges. The communication<br />
network is the largest social network analyzed to date.<br />
The planetary-scale network allowed us to explore dependen-<br />
cies among user demographics, communication characteris-<br />
tics, and network structure. Working with such a massive<br />
dataset allowed us to test hypotheses such as the average<br />
chain of separation among people across the entire world.<br />
We discovered that the graph is well connected, highly<br />
transitive, and robust. We reviewed the influence of multi-<br />
ple factors on communication frequency and duration. We<br />
found strong influences of homophily in activities, where<br />
people with similar characteristics tend to communicate more,<br />
with the exception of gender, where we found that cross-<br />
gender conversations are both more frequent and of longer<br />
duration than conversations with users of the same reported<br />
gender. We also examined the path lengths and validated<br />
on a planetary scale earlier research that found “6 degrees<br />
of separation” among people.<br />
We note that the sheer size of the data limits the kinds<br />
of analyses one can perform. In some cases, a smaller ran-<br />
dom sample may avoid the challenges with working with<br />
terabytes of data. However, it is known that sampling can<br />
corrupt the structural properties of networks, such as the de-<br />
gree distribution and the diameter of the graphs [15]. Thus,<br />
while sampling may be valuable for managing complexity<br />
of analyses, results on network properties with partial data<br />
sets may be rendered unreliable. Furthermore, we need to<br />
consider the full data set to reliably measure the patterns of<br />
age and distance homophily in communications.<br />
In other directions of research with the dataset, we have<br />
pursued the use of machine learning and inference to learn<br />
predictive models that can forecast such properties as com-<br />
munication frequencies and durations of conversations among<br />
people as a function of the structural and demographic at-<br />
tributes of conversants. Our future directions for research<br />
include gaining an understanding of the dynamics of the<br />
structure of the communication network via a study of the<br />
evolution of the network over time.<br />
We hope that our studies with Messenger data serve as<br />
an example of directions in social science research, highlight-<br />
ing how communication systems can provide insights about<br />
high-level patterns and relationships in human communica-<br />
tions without making incursions into the privacy of individ-<br />
uals. We hope that this first effort to understand a social<br />
network on a genuinely planetary scale will embolden others<br />
to explore human behavior at large scales.<br />
Acknowledgments<br />
We thank Dan Liebling for help with generating world map<br />
plots, and Dimitris Achlioptas and Susan Dumais for helpful<br />
suggestions.<br />
9. REFERENCES<br />
[1] R. Albert, H. Jeong, and A.-L. Barabasi. Error and<br />
attack tolerance of complex networks. Nature, 406:378,<br />
2000.<br />
[2] J. I. Alvarez-Hamelin, L. Dall’Asta, A. Barrat, and<br />
A. Vespignani. Analysis and visualization of large scale<br />
networks using the k-core decomposition. In ECCS<br />
’05: European Conference on Complex Systems, 2005.<br />
[3] D. Avrahami and S. E. Hudson. Communication<br />
characteristics of instant messaging: effects and<br />
predictions of interpersonal relationships. In CSCW<br />
’06, pages 505–514, 2006.<br />
[4] A.-L. Barabasi. The origin of bursts and heavy tails in<br />
human dynamics. Nature, 435:207, 2005.<br />
[5] V. Batagelj and M. Zaversnik. Generalized cores.<br />
ArXiv, (cs.DS/0202039), Feb 2002.<br />
[6] IDC Market Analysis. Worldwide Enterprise Instant<br />
Messaging Applications 2005–2009 Forecast and 2004<br />
Vendor Shares: Clearing the Decks for Substantial<br />
Growth. 2005.<br />
[7] J. Leskovec and E. Horvitz. Worldwide Buzz:<br />
Planetary-Scale Views on an Instant-Messaging<br />
Network. Tech. report MSR-TR-2006-186, 2006.<br />
[8] P. V. Marsden. Core discussion networks of Americans.<br />
American Sociological Review, 52(1):122–131, 1987.<br />
[9] M. McPherson, L. Smith-Lovin, and J. M. Cook.<br />
Birds of a feather: Homophily in social networks.<br />
Annual Review of Sociology, 27(1):415–444, 2001.<br />
[10] B. A. Nardi, S. Whittaker, and E. Bradner.<br />
Interaction and outeraction: instant messaging in<br />
action. In CSCW ’00: Proceedings of the 2000 ACM<br />
conference on Computer supported cooperative work,<br />
pages 79–88, 2000.<br />
[11] E. Ravasz and A.-L. Barabasi. Hierarchical<br />
organization in complex networks. Physical Review E,<br />
67(2):026112, 2003.<br />
[12] E. M. Rogers and D. K. Bhowmik.<br />
Homophily-heterophily: Relational concepts for<br />
communication research. Public Opinion Quarterly,<br />
34:523–538, 1970.<br />
[13] X. Shi, L. A. Adamic, and M. J. Strauss. Networks of<br />
strong ties. Physica A Statistical Mechanics and its<br />
Applications, 378:33–47, May 2007.<br />
[14] P. Singla and M. Richardson. Yes, there is a<br />
correlation &#8211; from social networks to personal behavior<br />
on the web. In WWW ’08, 2008.<br />
[15] M. P. Stumpf, C. Wiuf, and R. M. May. Subnets of<br />
scale-free networks are not scale-free: sampling<br />
properties of networks. PNAS, 102(12), 2005.<br />
[16] S. L. Tauro, C. Palmer, G. Siganos, and M. Faloutsos.<br />
A simple conceptual model for the internet topology.<br />
In GLOBECOM ’01, vol. 3, pages 1667 – 1671, 2001.<br />
[17] J. Travers and S. Milgram. An experimental study of<br />
the small world problem. Sociometry, 32(4), 1969.<br />
[18] A. Voida, W. C. Newstetter, and E. D. Mynatt. When<br />
conventions collide: the tensions of instant messaging<br />
attributed. In CHI ’02, pages 187–194, 2002.<br />
[19] D. J. Watts and S. H. Strogatz. Collective dynamics of<br />
’small-world’ networks. Nature, 393:440–442, 1998.<br />
[20] Z. Xiao, L. Guo, and J. Tracey. Understanding instant<br />
messaging traffic characteristics. In ICDCS ’07, 2007.</p>
]]></content:encoded>
			<wfw:commentRss>http://b000b000b000.wordpress.com/2008/07/03/66-is-the-new-6/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1bea80cb051e17ded449d47bc57533fc?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">a000a000a000</media:title>
		</media:content>
	</item>
		<item>
		<title>The hotel problem</title>
		<link>http://b000b000b000.wordpress.com/2008/05/16/the-hotel-problem/</link>
		<comments>http://b000b000b000.wordpress.com/2008/05/16/the-hotel-problem/#comments</comments>
		<pubDate>Fri, 16 May 2008 11:04:08 +0000</pubDate>
		<dc:creator>a000a000a000</dc:creator>
				<category><![CDATA[Google Analytics: What it is]]></category>
		<category><![CDATA[aurangabad]]></category>
		<category><![CDATA[banana bar]]></category>
		<category><![CDATA[bandra]]></category>
		<category><![CDATA[bihar]]></category>
		<category><![CDATA[computer peripherals]]></category>
		<category><![CDATA[dilhi jal board]]></category>
		<category><![CDATA[excelan]]></category>
		<category><![CDATA[hastinapur]]></category>
		<category><![CDATA[hawa mahal]]></category>
		<category><![CDATA[jaipur]]></category>
		<category><![CDATA[jharkhand]]></category>
		<category><![CDATA[kanwal@rekhi.com]]></category>
		<category><![CDATA[monza bar]]></category>
		<category><![CDATA[motihari]]></category>
		<category><![CDATA[naman chakraborty]]></category>
		<category><![CDATA[namanchakraborty]]></category>
		<category><![CDATA[New visitors + repeat visitors unequal to total visitor]]></category>
		<category><![CDATA[rajasthan]]></category>
		<category><![CDATA[rawatbhata]]></category>
		<category><![CDATA[santa cruz]]></category>
		<category><![CDATA[The hotel problem. analytics]]></category>
		<category><![CDATA[thermal power project]]></category>
		<category><![CDATA[TIE]]></category>
		<category><![CDATA[uttar pradesh]]></category>
		<category><![CDATA[uttaranchal]]></category>
		<category><![CDATA[web analytics]]></category>
		<category><![CDATA[www.google.com]]></category>
		<category><![CDATA[www.wikipedia.org]]></category>
		<category><![CDATA[www.yahoo.com]]></category>

		<guid isPermaLink="false">http://b000b000b000.wordpress.com/?p=4</guid>
		<description><![CDATA[New visitors + repeat visitors unequal to total visitors

 
]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">A very famous problem in analytics. It deals with the problem of the cumulative total of unique visitor, per day of the month not adding up to the total number of unique visitors that month. It has baffled the new analytics professionals for a long time and probably is the first problem that is encountered, by a new analytics professional.</span></p>
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">An inexperienced user may tend to take it as an anomaly with the software and the </span></p>
<p><span style="font-size:8pt;">The way to picture the situation is by imagining a hotel. The hotel has two rooms (Room A and Room B).</span></p>
<table class="MsoNormalTable" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;"> </span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Day 1</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Day 2</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Day 3</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Total</span></p>
</td>
</tr>
<tr>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Room A</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">John</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">John</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Jane</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">2 Unique Users</span></p>
</td>
</tr>
<tr>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Room B</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Mark</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Jane</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Mark</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">2 Unique Users</span></p>
</td>
</tr>
<tr>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Total</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">2</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">2</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">2</span></p>
</td>
<td style="background-color:transparent;border:#f1f3f8;padding:3pt;">
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;"> ?</span></p>
</td>
</tr>
</tbody>
</table>
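<p><span style="font-size:8pt;">The counts in the table can be checked with a short script. This is only an illustrative sketch; the dictionary layout and function names are made up for the example.</span></p>

```python
# Occupancy copied from the table: day -> room -> guest.
occupancy = {
    "Day 1": {"Room A": "John", "Room B": "Mark"},
    "Day 2": {"Room A": "John", "Room B": "Jane"},
    "Day 3": {"Room A": "Jane", "Room B": "Mark"},
}


def unique_per_day(occ):
    """Number of distinct guests on each day."""
    return {day: len(set(rooms.values())) for day, rooms in occ.items()}


def unique_per_room(occ):
    """Number of distinct guests seen in each room over all days."""
    guests = {}
    for rooms in occ.values():
        for room, guest in rooms.items():
            guests.setdefault(room, set()).add(guest)
    return {room: len(names) for room, names in guests.items()}


def unique_overall(occ):
    """Number of distinct guests in the whole hotel over all days."""
    return len({guest for rooms in occ.values() for guest in rooms.values()})


print(sum(unique_per_day(occupancy).values()))   # sum of the daily totals
print(sum(unique_per_room(occupancy).values()))  # sum of the room totals
print(unique_overall(occupancy))                 # distinct guests overall
```

<p><span style="font-size:8pt;">Unique counts are sizes of sets, and sets that share members cannot simply be added, which is why neither sum matches the overall figure.</span></p>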
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;"> </span></p>
<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;"> </span></p>
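<p class="MsoNormal" style="margin:0;"><span style="font-size:8pt;">Adding the daily totals gives 6 and adding the room totals gives 4, yet only 3 distinct guests (John, Jane and Mark) stayed at the hotel, because a guest who appears in more than one cell counts only once in the overall total. A related concept is as follows:</span></p>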
<p class="MsoNormal" style="margin:0;"><span class="mw-headline"><strong><span style="text-decoration:underline;"><span style="font-size:8pt;">New visitors + repeat visitors unequal to total visitors</span></span></strong></span></p>
<p class="MsoNormal" style="margin:0;"><span class="mw-headline"><strong><span style="text-decoration:underline;"><span style="font-size:8pt;"><span style="text-decoration:none;"> </span></span></span></strong></span></p>
<p><span style="font-size:8pt;">Another common misconception in web analytics is that the sum of the new visitors and the repeat visitors ought to be the total number of visitors. Again this becomes clear if the visitors are viewed as individuals on a small scale, but still causes a large number of complaints that analytics software cannot be working because of a failure to understand the metrics.</span></p>
<p><span style="font-size:8pt;">Here the culprit is the metric of a new visitor. There is really no such thing as a new visitor when you are considering a web site from an ongoing perspective. If a visitor makes their first visit on a given day and then returns to the web site on the same day they are both a new visitor and a repeat visitor for that day. So if we look at them as an individual which are they? The answer has to be both, so the definition of the metric is at fault.</span></p>
<p><span style="font-size:8pt;">A new visitor is not an individual; it is a facet of the web measurement. For this reason it is easiest to conceptualize the same facet as a first visit (or first session). This resolves the conflict and so removes the confusion. Nobody expects the number of first visits to add to the number of repeat visitors to give the total number of visitors. The metric will have the same number as the new visitors, but it is clearer that it will not add in this fashion.</span></p>
<p><span style="font-size:8pt;">On the day in question there was a first visit made by our chosen individual. There was also a repeat visit made by the same individual. The number of first visits and the number of repeat visits will add up to the total number of visits for that day.</span></p>
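<p><span style="font-size:8pt;">The arithmetic can be checked with a small Python sketch. The visit log, the visitor names and the <code>summarize</code> function are all made up for illustration and are not taken from any analytics product. Each record is one visit, and a visit number of 1 marks that visitor's first visit.</span></p>

```python
# Hypothetical one-day visit log: (visitor_id, visit_number_for_that_visitor).
# visit_number == 1 marks a first visit; any other number is a repeat visit.
VISITS = [
    ("alice", 1),  # Alice's first visit to the site
    ("alice", 2),  # Alice returns on the same day
    ("bob", 5),    # Bob has already visited on earlier days
]


def summarize(visits):
    """Count visits and distinct visitors for one day of (visitor, visit_number) records."""
    first_visits = sum(1 for _, number in visits if number == 1)
    repeat_visits = sum(1 for _, number in visits if number != 1)
    # Visitor counts are counts of distinct people, so one person can be in both sets.
    new_visitors = {visitor for visitor, number in visits if number == 1}
    repeat_visitors = {visitor for visitor, number in visits if number != 1}
    all_visitors = {visitor for visitor, _ in visits}
    return {
        "first_visits": first_visits,
        "repeat_visits": repeat_visits,
        "total_visits": len(visits),
        "new_visitors": len(new_visitors),
        "repeat_visitors": len(repeat_visitors),
        "total_visitors": len(all_visitors),
    }


if __name__ == "__main__":
    for name, value in summarize(VISITS).items():
        print(name, value)
```

<p><span style="font-size:8pt;">In this made-up log, first visits (1) plus repeat visits (2) equal total visits (3), while new visitors (1) plus repeat visitors (2) do not equal total visitors (2), because Alice is counted in both groups.</span></p>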
<p><em><span style="text-decoration:underline;"><span style="font-size:8pt;">Some related links on www.wikipedia.com</span></span></em></p>
<ul type="disc">
<li class="MsoNormal"><span style="font-size:8pt;"><a title="Web log analysis software" href="http://en.wikipedia.org/wiki/Web_log_analysis_software">Web log analysis software</a> </span></li>
<li class="MsoNormal"><span style="font-size:8pt;"><a title="Web bug" href="http://en.wikipedia.org/wiki/Web_bug">Web bug</a> </span></li>
<li class="MsoNormal"><span style="font-size:8pt;"><a title="Business Intelligence" href="http://en.wikipedia.org/wiki/Business_Intelligence">Business Intelligence</a> </span></li>
<li class="MsoNormal"><span style="font-size:8pt;"><a title="Customer engagement" href="http://en.wikipedia.org/wiki/Customer_engagement">Customer engagement</a> </span></li>
<li class="MsoNormal"><span style="font-size:8pt;"><a title="Win-loss analytics" href="http://en.wikipedia.org/wiki/Win-loss_analytics">Win-loss analytics</a> </span></li>
<li class="MsoNormal"><span style="font-size:8pt;"><a title="Emetrics Summit" href="http://en.wikipedia.org/wiki/Emetrics_Summit">Emetrics Summit</a> </span></li>
<li class="MsoNormal"><span style="font-size:8pt;"><a title="Path analysis (computing)" href="http://en.wikipedia.org/wiki/Path_analysis_%28computing%29">Path analysis</a></span></li>
</ul>
<ul type="disc">
<li class="MsoNormal"><span style="font-size:8pt;"><a title="http://www.ga-experts.co.uk/web-data-sources.pdf" href="http://www.ga-experts.co.uk/web-data-sources.pdf">Web Traffic Data Sources and Vendor Comparison</a> by GA Experts </span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://b000b000b000.wordpress.com/2008/05/16/the-hotel-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1bea80cb051e17ded449d47bc57533fc?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">a000a000a000</media:title>
		</media:content>
	</item>
		<item>
		<title>What the hell is Google Analytics?</title>
		<link>http://b000b000b000.wordpress.com/2008/05/15/what-the-hell-is-google-analytics/</link>
		<comments>http://b000b000b000.wordpress.com/2008/05/15/what-the-hell-is-google-analytics/#comments</comments>
		<pubDate>Thu, 15 May 2008 21:21:32 +0000</pubDate>
		<dc:creator>a000a000a000</dc:creator>
				<category><![CDATA[Google Analytics: What it is]]></category>
		<category><![CDATA[analytics]]></category>
		<category><![CDATA[ashit joshi]]></category>
		<category><![CDATA[bhagalpur]]></category>
		<category><![CDATA[bihar]]></category>
		<category><![CDATA[cheap washing machines]]></category>
		<category><![CDATA[dadar and nagar haveli]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[google analytics]]></category>
		<category><![CDATA[hari rastogi]]></category>
		<category><![CDATA[haryana]]></category>
		<category><![CDATA[how would i know what to type]]></category>
		<category><![CDATA[naman chakraborty]]></category>
		<category><![CDATA[namanchakraborty]]></category>
		<category><![CDATA[om prakash chautala]]></category>
		<category><![CDATA[patna]]></category>
		<category><![CDATA[www.google.com]]></category>
		<category><![CDATA[www.wikipedia.com]]></category>
		<category><![CDATA[www.yahoo.com]]></category>

		<guid isPermaLink="false">http://b000b000b000.wordpress.com/?p=3</guid>
		<description><![CDATA[Well, according to www.wikipedia.com, Google Analytics is a free service offered by Google that generates detailed statistics about the visitors to a website.]]></description>
			<content:encoded><![CDATA[<p><span style="font-size:8pt;color:#000000;"><img style="vertical-align:top;" src="http://farm1.static.flickr.com/197/503530844_bd2565de8c.jpg?v=0" alt="google analytics logo" width="420" height="84" /></span></p>
<p><span style="font-size:8pt;color:#000000;"><img style="vertical-align:middle;" src="http://www.noscope.com/media/analytics.jpg" alt="google analytics logo" width="530" height="561" /></span></p>
<p><span style="font-size:8pt;color:#000000;">Well, according to <a href="http://www.wikipedia.com">www.wikipedia.com</a>, Google Analytics is a free service offered by Google that generates detailed statistics about the visitors to a website. Here is the <a title="Google Analytics on Wikipedia" href="http://en.wikipedia.org/wiki/Google_Analytics" target="_blank">link</a>!</span></p>
<p><span style="font-size:8pt;">Its main highlight is that a <a title="Webmaster" href="http://en.wikipedia.org/wiki/Webmaster">webmaster</a> can optimize <a title="AdWords" href="http://en.wikipedia.org/wiki/AdWords">AdWords</a> advertising and marketing campaigns using GA&#8217;s analysis of where visitors came from, how long they stayed on the website, and their geographical location.</span></p>
<p><span style="font-size:8pt;">Users can define and track <a title="Conversion (marketing)" href="http://en.wikipedia.org/wiki/Conversion_%28marketing%29">conversions</a>, or goals. Goals might include sales, lead generation, viewing a specific page, or downloading a particular file. By using this tool, marketers can determine which ads are performing, and which are not, as well as find unexpected sources of quality visitors.</span></p>
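<p><span style="font-size:8pt;">As a rough illustration of what comparing ads by goal means, here is a minimal Python sketch. It is not Google Analytics code or its API; the source names, the records and the <code>conversion_rates</code> function are hypothetical. Each record is one visit, tagged with the traffic source it came from and whether the visitor reached the goal.</span></p>

```python
from collections import defaultdict

# Hypothetical visit records: (traffic_source, reached_goal).
VISIT_RECORDS = [
    ("adwords_campaign_a", True),
    ("adwords_campaign_a", False),
    ("adwords_campaign_b", False),
    ("adwords_campaign_b", False),
    ("referral_blog", True),
]


def conversion_rates(records):
    """Return {source: goal completions / visits} for each traffic source."""
    visits = defaultdict(int)
    goals = defaultdict(int)
    for source, reached_goal in records:
        visits[source] += 1
        if reached_goal:
            goals[source] += 1
    # Every source in `visits` has at least one visit, so the division is safe.
    return {source: goals[source] / visits[source] for source in visits}


if __name__ == "__main__":
    for source, rate in sorted(conversion_rates(VISIT_RECORDS).items()):
        print(source, rate)
```

<p><span style="font-size:8pt;">With these made-up records, campaign A converts half of its visits, campaign B converts none, and the blog referral turns out to be an unexpected source of quality visitors.</span></p>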
<p><span style="font-size:8pt;">Google&#8217;s service was modeled upon <a title="Urchin Software Corporation" href="http://en.wikipedia.org/wiki/Urchin_Software_Corporation">Urchin Software Corporation</a>&#8216;s analytics system, <a title="Urchin (software)" href="http://en.wikipedia.org/wiki/Urchin_%28software%29">Urchin on Demand</a> (Google acquired Urchin Software Corp. in <a title="April 2005" href="http://en.wikipedia.org/wiki/April_2005">April 2005</a>). Google still sells the standalone installable Urchin software through a network of <a title="Value-added reseller" href="http://en.wikipedia.org/wiki/Value-added_reseller">value-added resellers</a>; Urchin customers complained that support for and development of the standalone product languished after the Google acquisition, although a new release entered beta testing in October 2007<sup><a href="http://en.wikipedia.org/wiki/Google_Analytics#cite_note-0#cite_note-0">[1]</a></sup>. The system also brings ideas from Adaptive Path, whose product, Measure Map, was acquired and renamed to Google Analytics in 2006.</span></p>
]]></content:encoded>
			<wfw:commentRss>http://b000b000b000.wordpress.com/2008/05/15/what-the-hell-is-google-analytics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/1bea80cb051e17ded449d47bc57533fc?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">a000a000a000</media:title>
		</media:content>

		<media:content url="http://farm1.static.flickr.com/197/503530844_bd2565de8c.jpg?v=0" medium="image">
			<media:title type="html">google analytics logo</media:title>
		</media:content>

		<media:content url="http://www.noscope.com/media/analytics.jpg" medium="image">
			<media:title type="html">google analytics logo</media:title>
		</media:content>
	</item>
	</channel>
</rss>
