Love, Life and R

It behooves every person to understand and analyze their life in all its aspects, so as to capitalize on every opportunity that life has to offer. R is a perfect means to that end. A little coding knowledge and a great deal of common sense can yield productive trysts with destiny, and the leverage to steer it in your favor. These analyses may make life and the analyzer sound robotic, but when life throws you lemons (in the form of data), why not make lemonade (in the form of insights) out of them?

*This post was originally published on Wired.com on Nov 14, 2013*

Questions:

Love: “How would it be, if you could analyze your complete Whatsapp/Snapchat history, and understand how you have conversed with the love of your life and how your love grew?”

Life: “How would it be, if you could understand where you are spending most of your money and what your major expenditures are going to be for the coming year from your bank statements?”

Work: “How would it be, if you could understand from your work email and other work-related logs, the most productive time of your day and the perfect time for you to start a specific task?”

If these questions arouse your curiosity and you want more control over your life, the immediate question is: how do I do it? R and Personal Analytics are your answers. R is a statistical computing tool that gives any beginner, even one with little background in statistics or algorithms, immense power to analyze their personal life. Personal Analytics is about understanding human behavior and daily personal events in a way that gives the user positive, actionable insights.

For a long time, the quantified self and personal analytics have been domains reserved for the stats-savvy, consistently bogged down by concerns about computational complexity and feasibility. In this article, we propose to blow these myths away and make Personal Analytics, true to its name, “personal” to all. We will also discuss how R fits into this bigger objective and how we can start thinking along the right lines to analyze ourselves.

Love in the Time of R

LoveinRWordcloud.png

It is surprising how quickly so many people have taken to OTT (Over The Top) messaging alongside traditional SMS. It is widely reported that almost 41 billion messages are sent over OTT services. One widely used application is Whatsapp, and I personally used it to start talking to the love of my life. I have always been interested in understanding this love story, and I used the mot juste tool for such things: R. Whatsapp allows users to download an entire chat history in text format at the click of a button. Data from Whatsapp and the analytical horsepower of R shed some interesting light on this love story.

From the word cloud of my chat history, I learned that the keywords we used most were Yes, Ok and Nice, indicating that we agreed with each other a lot. Other keywords indicated a lot of laughter, and discussion about quotidian activities and work. Although it is a simple word cloud, it gives remarkable clarity about the kind of conversations we had.

Another interesting insight gathered from the history was the number of texts exchanged by hour of the day. The number of texts rises gradually as the day begins, peaks around the start of the office day, dips sharply during the lunch hour, holds at an average level through the afternoon and picks up again in the late hours of the day. The benefit of doing this? It helps you understand when time is spent texting and how it is spent.

LoveinRTimespent.png

Data: Export the chat history from Whatsapp as a text file: select a specific conversation, tap on settings and choose the “Email Conversation” option.

Analysis: The wordcloud() function (from the wordcloud package) and the base aggregate() function will yield the desired results in R.
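As a rough illustration of that analysis, here is a minimal sketch in R. The sample lines, the sender names and the "date, time - sender: message" layout are all assumptions made for this example; real exports differ by app version and locale, so the pattern would need adjusting. Word counts use base R, and the word cloud is drawn only if the wordcloud package happens to be installed.

```r
# Hypothetical sample lines; a real export would be loaded with readLines("chat.txt").
# The "date, time - sender: message" layout is an assumption, not a documented format.
chat <- c(
  "14/11/2013, 09:15 - Asha: Yes, ok",
  "14/11/2013, 09:17 - Ravi: Nice! Yes",
  "14/11/2013, 13:02 - Asha: Ok, lunch now",
  "14/11/2013, 22:40 - Ravi: Nice day, yes"
)

# Split each line into hour, sender and message text.
pattern <- "^([0-9/]+), ([0-9]{2}):([0-9]{2}) - ([^:]+): (.*)$"
msgs <- data.frame(
  hour   = as.integer(sub(pattern, "\\2", chat)),
  sender = sub(pattern, "\\4", chat),
  text   = sub(pattern, "\\5", chat),
  stringsAsFactors = FALSE
)

# Word frequencies: lower-case the text, split on anything that is not a letter.
words <- unlist(strsplit(tolower(msgs$text), "[^a-z']+"))
words <- words[nchar(words) > 0]
freq  <- sort(table(words), decreasing = TRUE)

# Number of messages per hour of the day, using aggregate().
msgs$n   <- 1L
per_hour <- aggregate(n ~ hour, data = msgs, FUN = sum)
barplot(per_hour$n, names.arg = per_hour$hour,
        xlab = "Hour of day", ylab = "Messages")

# Draw the word cloud only when the optional wordcloud package is available.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(names(freq), as.integer(freq), min.freq = 1)
}
```

On a real chat history, common filler words would probably need to be filtered out before the cloud says much about the conversation.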

Life in the Time of R

Life is not, in essence, about finances and money. But it would be a little dangerous to go on living without a finger on the pulse of one’s expenses. It is necessary to have a good, if not complete, understanding of where the purse is bleeding money. With the proliferation of online banking and the ability to obtain transaction summaries at a mouse click, it has become easier to understand one’s financial history. I wanted to understand where I was spending the most but, even more, which categories I would be spending the most on in the coming months.

Again, voila: thanks to R and my tech-savvy bank’s willingness to provide my data as a simple .csv download, I was able to predict where I was most prone to spend my money. I used that insight to manage my money smartly over the following two months, and I continue to use R to help me manage my wallet better.

LifeinRFinances.png

The above pie-chart comparison between my previous and current year finances says a great deal about how I have changed as a spender since the coup de foudre mentioned above. Investment takes the biggest chunk of my finances (a good 25%), thanks to my dad’s solid advice. It is interesting to note that while gift-giving has crept in as a significant spending segment in its own right, it has come only at the expense of less healthy and less beneficial spending on restaurant food and parties.

I was able to run simple forecasting models in R using my previous data, and learned that over the coming couple of months I have a natural tendency to spend a lot of money on travel, owing to year-end vacations and holidays. Ergo, I have been planning my travels in advance and trying to optimize my overall travel spend.
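The article does not say which forecasting model was used. As one very simple stand-in, the sketch below uses a seasonal mean: the expected spend for a calendar month is the average of that month across past years. The figures and column names are hypothetical, made up purely for illustration.

```r
# Hypothetical monthly travel spend for two past years (made-up numbers).
spend <- data.frame(
  year   = rep(c(2011, 2012), each = 12),
  month  = rep(1:12, times = 2),
  travel = c(50, 40, 60, 55, 70,  90, 80, 75, 60, 65, 150, 300,
             60, 50, 60, 65, 80, 100, 90, 85, 70, 75, 170, 340)
)

# Seasonal-mean forecast: average each calendar month across the years on record.
seasonal <- aggregate(travel ~ month, data = spend, FUN = mean)

# Expected spend for the upcoming November and December.
upcoming <- seasonal[seasonal$month %in% c(11, 12), ]
print(upcoming)
```

With more history, a proper time-series model could replace the monthly average, but even this crude version is enough to show a year-end spike.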

Data: Download the online statement/bank summary from your bank’s online portal as a .csv file.

Analysis: Simple pie() and barplot() along with aggregate() functions will yield the necessary results.
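A minimal sketch of that analysis follows. The transactions and the column names (year, category, amount) are assumptions for illustration; every bank labels its export differently, and categories usually have to be assigned by hand or by keyword.

```r
# Hypothetical statement rows; a real file would be loaded with read.csv("statement.csv").
# Column names and categories are assumptions, not a real bank's export format.
txn <- data.frame(
  year     = c(2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013),
  category = c("Restaurants", "Parties", "Investment", "Travel",
               "Restaurants", "Gifts", "Investment", "Travel"),
  amount   = c(300, 200, 250, 250, 150, 200, 250, 400),
  stringsAsFactors = FALSE
)

# Total spend per year and category.
by_cat <- aggregate(amount ~ year + category, data = txn, FUN = sum)

# Share of each category within its own year, as a percentage.
by_cat$share <- 100 * by_cat$amount / ave(by_cat$amount, by_cat$year, FUN = sum)

# One pie chart per year, side by side.
op <- par(mfrow = c(1, 2))
for (y in sort(unique(by_cat$year))) {
  d <- by_cat[by_cat$year == y, ]
  pie(d$amount, labels = d$category, main = paste("Spending", y))
}
par(op)

# The same comparison as grouped bars, one group per category.
barplot(xtabs(amount ~ year + category, data = txn),
        beside = TRUE, legend.text = TRUE, las = 2, ylab = "Amount")
```

The grouped bar chart is often easier to read than two pies when the goal is to compare the same category across years.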

Work in the Time of R

Stephen Wolfram wrote an amazing blog post on Personal Analytics, which detailed his everyday email and call activity over a period of 20-odd years and meticulously tracked the most productive times of his day. It is pretty simple and straightforward to do that, and more, using your very own code in R. Email metadata is easily available, and with some basic logic in the code I was able to figure out the best response time for my mails and when I should be drafting a new mail rather than replying to an existing thread. No other software is needed, and there is no need to upload your precious data onto public online servers.

WorkinRMailsandMeetings.png

From the above graph, I figured out the volume of incoming and outgoing mails, along with the barrage of meetings that fill up my day. It is quite evident that I do not respond immediately to the burst of incoming mails. I choose to let them accumulate over time, and reply to all of them in a set time frame, so that I can spend time on quality replies and not respond on impulse. Nonetheless, looking at the overlap of sent messages and the meetings, I have learnt that I could probably get more systematic with my replies, so I can spend more quality time in meetings and not worry about having to take care of the backlog of emails.

Data: Microsoft Outlook has a simple Import/Export feature, which allows you to export your calendars and mailboxes to a .csv file.

Analysis: Basic aggregate() and barplot() functions will yield the necessary results.
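A minimal sketch of that analysis is shown here. The rows, the column names (direction, timestamp) and the timestamp format are assumptions for this example; an actual Outlook export has its own columns, and sent and received mail may arrive as separate files.

```r
# Hypothetical export rows; column names and timestamp format are assumptions.
mail <- data.frame(
  direction = c("received", "received", "received", "sent",
                "sent", "received", "sent"),
  timestamp = c("2013-11-14 09:05:00", "2013-11-14 09:40:00",
                "2013-11-14 10:15:00", "2013-11-14 16:30:00",
                "2013-11-14 16:45:00", "2013-11-15 09:20:00",
                "2013-11-15 16:10:00"),
  stringsAsFactors = FALSE
)

# Extract the hour of the day from each timestamp.
mail$hour <- as.integer(format(as.POSIXct(mail$timestamp, tz = "UTC"), "%H"))

# Count messages per hour and direction with aggregate().
mail$n <- 1L
counts <- aggregate(n ~ hour + direction, data = mail, FUN = sum)

# Reshape to a direction-by-hour table and plot incoming against outgoing mail.
tab <- xtabs(n ~ direction + hour, data = counts)
barplot(tab, beside = TRUE, legend.text = TRUE,
        xlab = "Hour of day", ylab = "Messages")

# Hour with the most outgoing mail.
sent      <- counts[counts$direction == "sent", ]
peak_sent <- sent$hour[which.max(sent$n)]
```

Meeting times from a calendar export could be added as a third row of the table in the same way.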

Health in the Time of R

Having analyzed my love, life and work with R, I have come to believe strongly that Personal Analytics might be the secret to a long life. For example, I monitor my daily running patterns alongside my weight, to understand how my running affects my overall weight loss and which route and distance are optimal. I also analyze my water consumption and food intake to see how they affect my overall health.

Thus, Personal Analytics can be put to stellar use in understanding a person’s health every single second of their life. It can also monitor vital signs more closely for persons under distress. By understanding these vital signs and their patterns, it would be possible to pre-diagnose or predict life-threatening situations and provide pre-emptive treatment at the right time. Although it sounds too fantastical to be true, it could soon be a reality with the right sensors and more human-friendly hardware.

Conclusion

It behooves every person to understand and analyze their life in all its aspects, so as to capitalize on every opportunity that life has to offer. R is a perfect means to that end. A little coding knowledge and a great deal of common sense can yield productive trysts with destiny, and the leverage to steer it in your favor. These analyses may make life and the analyzer sound robotic, but when life throws you lemons (in the form of data), why not make lemonade (in the form of insights) out of them?