Gmail Footprint

Gmail Footprint is a personalized live analysis of a Gmail account. It offers hidden meta-information and visualizations of relationship patterns, with the aim of exposing the nature of contacts through the use of language.

Visit the Gmail Footprint site to see the exploration.

Gmail Footprint started as a data visualization of a nine-year-old mailbox, exploring the relationships between the most common words, topics, and tokens and how they are distributed.

Around ten years ago I started speaking English as my main language. With a limited vocabulary and positive encouragement, I decided to give it my best try. A year later, while I was still at the very beginning, and without any connection to my progress, I also opened my first Gmail account, which has been my personal email account ever since. Playing around with the Google OAuth API, I decided to do some data mining and see what I was actually saying versus what I had been told.

At first I looked at the more general data: the overall impact of all the emails I sent and received. I first filtered the emails that I sent and assembled a contact list based on their recipients. Then I downloaded all the emails I received from the people I had actually contacted, which made for an easy spam filter. Then, with a few Python tricks, I tokenized and sorted the data. Here are some of the interesting picks, visualized with D3. As you’ll see, I started by looking at the total data, but I found that it was actually more interesting to explore specific contacts.
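The filtering and tokenizing steps described above could look roughly like the sketch below. It is not the project's actual code: the message dictionaries, the sample addresses, and the helper names (`addresses`, `build_contacts`, `filter_received`, `tokenize`, `word_counts`) are all assumptions made for illustration, and real messages would come from the Gmail API rather than from hard-coded lists.

```python
import re
from collections import Counter
from email.utils import getaddresses

# Hypothetical in-memory messages standing in for a real mailbox.
sent = [
    {"to": "Ana <ana@example.com>, bob@example.com",
     "body": "Hello Ana, lunch tomorrow?"},
]
received = [
    {"from": "Ana <ana@example.com>",
     "body": "Lunch sounds great, see you tomorrow!"},
    {"from": "deals@spam.example",
     "body": "Buy now, limited offer!"},
]

def addresses(header):
    """Return the lowercase email addresses found in a To/From header."""
    return {addr.lower() for _, addr in getaddresses([header]) if addr}

def build_contacts(sent_messages):
    """Collect every address that a sent message was addressed to."""
    contacts = set()
    for msg in sent_messages:
        contacts |= addresses(msg["to"])
    return contacts

def filter_received(received_messages, contacts):
    """Keep only received messages whose sender is a known contact."""
    return [m for m in received_messages if addresses(m["from"]) & contacts]

def tokenize(text):
    """Split text into lowercase word tokens (a deliberately naive rule)."""
    return re.findall(r"[a-z']+", text.lower())

def word_counts(messages):
    """Count how often each token appears across the message bodies."""
    counts = Counter()
    for msg in messages:
        counts.update(tokenize(msg["body"]))
    return counts

contacts = build_contacts(sent)
kept = filter_received(received, contacts)
counts = word_counts(sent + kept)
print(counts.most_common(5))
```

Because the contact list is built only from people I wrote to, the message from the unknown sender never reaches the word counts, which is the "easy spam filter" effect.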


The contact list is a bit longer, but not by much. I was surprised to see that I’ve only been in touch with 662 people throughout the years, and even more so that I exchanged more than 24 emails with fewer than 24 of them.
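A count like this can be reproduced with a simple tally per contact. The sketch below is only an illustration under assumed data: the `exchanges` list, the sample addresses, and the `frequent_contacts` helper are hypothetical, with one list entry per email sent to or received from a contact.

```python
from collections import Counter

# Hypothetical log: one entry per email exchanged, keyed by contact address.
exchanges = (
    ["ana@example.com"] * 30
    + ["bob@example.com"] * 3
    + ["cy@example.com"]
)

def frequent_contacts(contact_log, threshold=24):
    """Return the contacts with more than `threshold` exchanged emails."""
    per_contact = Counter(contact_log)
    return {c: n for c, n in per_contact.items() if n > threshold}

per_contact = Counter(exchanges)
frequent = frequent_contacts(exchanges)
print(len(per_contact), "contacts,", len(frequent), "with more than 24 emails")
```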