I like to build short presentations. Talking at a group for even half an hour non-stop seems like too much of a one-way street. My hope is that people come not for the lecture-style format, but to start a dialog about something that I may have seen or done, or to discuss ideas they would like to share. Often I feel that one can get lost in the overview, or dig too deep into topics that people don’t care about. Getting to the Q&A helps target the talk to exactly what the audience is interested in and allows things to get straight to the interesting tidbits. Knowing whether the audience wants to know the tech stack, wants to see data-viz, or just cares about building teams is very difficult; this is my shortcut to targeting the message properly.
That being said, there were lots of very interesting questions! I wanted to give an overview and potentially dig a bit deeper into what I found people are most interested in right now.
1) Momentum is increasing in the quantified-self movement. How can I take all these devices and applications that are collecting data (Fitbits, sensors, iPhone apps), combine and correlate the data, and draw conclusions about why that headache happened or how much I should be exercising, or just generally look back at my history? The number of sensors is dramatically increasing and data scientists have now taken notice; analyzing these data and creating mashups and utilities for users is going to be a big thing in the next few years.
2) How are data teams being built? I have personally heard this question more and more often in the last year. There are two sides: how do I become a data scientist, and how do I build a team of data scientists? I touched on this a bit in my slides, but I feel there is a lot more to be said. Building any team is hard, and since this is a relatively new field, there is the additional challenge of first defining the team you want.
3) R is coming back in a big way. Seeing an open-source solution like R develop as a platform, with a rich set of libraries and tools and new methods to parallelize and distribute scientific computations, is really fun. Once things mature, I can see this dramatically cutting down the time from concept to product.
4) There are a *lot* of folks building analytics platforms. There is an obvious gap in the market for doing analytics on big data. I’m glad folks are finally building tools to explore and query large datasets interactively.
I’m not a huge social media user (ironic, I know), but I realized that there is a pretty substantial conversation happening on the internet around startups, data, and New York City. Although some of this lives in the ephemeral land of Twitter, a lot of it is happening through blogs, especially as a way to anchor and start new conversations. I hope to use this as a vehicle to join in that conversation and give my two cents, for whatever they are worth.
I have been seeing a massive surge of activity in all of the areas that intersect my life and what I do for work. The startup scene in New York is exploding, data science and engineering are becoming a huge trend, and as a result there are tons of people getting involved in this quickly developing field. Nurturing that growing environment and helping shape the field as it becomes a dominant force are a few of the things that I hope to achieve with this side of the conversation.
Let’s keep the dialog open; hopefully I will learn to post the things that are most useful to anyone who is also interested in these topics. Thanks for reading!