I’ve often said that “data science” is the new “plastics,” hearkening back to that famous scene in The Graduate where a neighbor gives cryptic one-word career advice to the young graduate Benjamin Braddock, portrayed by Dustin Hoffman.
I’ve said “data science” to my own son numerous times as well. (Yes, that’s the one in college, not the one in grade school, but I suppose it’s never too early to start.)
The other day, I found this great post on the subject from Zipfian Academy and I not only tweeted it on the spot, but wanted to blog about it here.
Here’s the introduction:
There are plenty of articles and discussions on the web about what data science is, what qualities define a data scientist, how to nurture them, and how you should position yourself to be a competitive applicant. There are far fewer resources out there about the steps to take in order to obtain the skills necessary to practice this elusive discipline. Here we will provide a collection of freely accessible materials and content to jumpstart your understanding of the theory and tools of Data Science.
Big Data, Cloud Computing and Industry Perspectives with Dave Kellogg
BY Darren Cunningham
I had the pleasure of working with Dave Kellogg early in my marketing career and continue to learn from him as a regular subscriber to his popular blog, Kellblog. A seasoned Silicon Valley executive, Dave has been a board member (Aster Data), CEO (MarkLogic), CMO (Business Objects) and VP of Marketing (Versant and Ingres). I recently sat down with Dave to discuss industry trends. As always, he didn’t hold back.
Dave, you’ve written a lot about “Big Data” on your blog. Why is it such a hot topic in the world of data management?
First, I think Big Data is a hot topic because it represents the first time in about 30 years that people are rethinking databases. Literally, since about 1980 people haven’t had to think much about databases. If you were an SMB, you went SQL Server; if you were enterprise, you’d go Oracle or IBM depending on your enterprise preferences. But in terms of technology, to paraphrase Henry Ford: any color you want, as long as it’s relational.
Overall, I think Big Data is hot for three reasons:
Major new innovation is finally happening with databases for the first time in three decades.
Hardware architectures have changed — people want to scale horizontally like Google.
We are experiencing a serious explosion in the amount of data people are analyzing and managing. Machine-generated data, the exhaust of the Web, is driving a lot of it.
I think Big Data is challenging on many fronts from the cool (e.g., analytics and query optimization), to the practical (e.g., horizontal scaling), to the mundane (e.g., backup and recovery).
What’s the intersection with Cloud Computing?
I think when people say cloud computing, they mean one of several things:
SaaS: The use of software applications or platforms as services.
Dynamic scaling: My favorite example of this is Britain’s Got Talent, which uses Cassandra. Most of the time they have nothing to do. Then one night half the country is trying to vote for their favorite contestants.
Service orientation: The ability to weave together applications by calling various cloud services — in effect using a series of cloud services as a platform on which to build applications.
I think Big Data intersects with cloud in several ways. First, the people running cloud services are dealing with Big Data problems. They are hosting thousands of customers’ databases and generating log records from hundreds of thousands of users. I also think Big Data analytics are very dynamic loads. One minute you want nothing, then suddenly you need to throw 100 servers at a complex problem for several hours.
How do you see these trends changing the role of IT?
I think corporate IT is constantly evolving because smart corporations want their internal resources focused on activities that they can’t buy elsewhere and that generate competitive advantage for the business.
IT used to buy and run computers. Then they used to build and run applications. Then they focused on weaving together packaged applications. Going forward, they will focus on tightly integrating cloud-based services. They will also continue to focus on company-proprietary analytics used to gain competitive advantage.
The other trend driving IT is consumerization. The Web sets expectations for functionality, user interface and quality that corporate IT must meet with internal systems. The bar has gone way up – people won’t tolerate old-school ERP-style interfaces at work when they’re used to Facebook or Yelp.
What does that mean for technology sales and marketing?
If Mr. McGuire in The Graduate were dishing out advice today, instead of saying “plastics,” he’d say “data science.” More and more companies will use data scientists to analyze their business and drive tactical operations. First you need to gather a whole bunch of data about your operations and customers. Then you need to throw world-class data analysts at it to get business value and to be sure you don’t draw false conclusions – e.g., mixing causality with correlation.
Today, most companies have their sales departments on salesforce.com. Leading marketing departments are on Marketo or Eloqua, but most marketers still don’t have much technology backing them. Going forward you will see a whole class of analytics applications vendors providing advanced analytics for Salesforce (e.g., Cloud9, Good Data) and the marketing automation vendors will move beyond lead incubation into providing overall marketing suites. I expect Marketo or Eloqua to try to do for the chief marketing officer what SuccessFactors did for the chief people officer – and if they don’t, then there’s a real opportunity for someone else.
Speaking of all things cloud, you often write about Silicon Valley trends. How would you characterize what’s going on in the market right now?
From my perception, the Silicon Valley innovation engine is running full out. Top VCs are raising new funds. I meet a few new startups every day. Of late, I’ve met fascinating companies in next-generation business intelligence, analytics, Big Data, social media monitoring and exploitation and Web application development. One of the more interesting things I’ve found is a VC fund dedicated to big data – IA Ventures (in New York). When I heard about them, I thought: oh, lots of Big Data infrastructure and platform technologies. Then I spent some time and realized that most of their portfolio is about exploiting new Big Data infrastructure technologies via vertical applications. That was really interesting.
People will debate whether we’re in a mini tech bubble or a social networking-specific bubble. Who knows? I just read an article in The Wall Street Journal that argues a $140B valuation for Facebook is realistic, and it was fairly convincing. So you can debate the bubble issue but you can’t debate that the IPO market has been closed for a long time. Now it is starting to open, and that’s a huge change in Silicon Valley.
What advice do you have for both entrepreneurs and IT veterans?
Don’t build or run things that you can buy or rent. If you follow that mantra, you will follow market trends, and always stay at the right stack-layer to ensure that you are adding value as opposed to leveraging old skill sets. While you may know how to run a Big Data center, you can now rent time in one more cost-effectively. So either go work for a company that runs data centers (e.g., Equinix) if that’s your pleasure, or go leverage the people who do. Put differently, don’t be static. If you’re still using skills you learned 10 years ago, make sure that you’re not teeing yourself up to get left behind.
[Notes: Minor changes made from the SandHill post. I added emphasis via bolding and I corrected the attribution of the famous line “plastics” from The Graduate. It was not Mr. Robinson, but Mr. McGuire, who said it.]
Since I’m on the board of Aster Data I will refrain from editorial on this announcement and simply say congratulations to Teradata on buying a great company and congratulations to Aster Data, its founders Mayank Bawa, Tasso Argyros, and George Candea, its investors, and its employees on what I view as a successful win/win outcome.
I’m Dave Kellogg, advisor, director, consultant, angel investor, and blogger focused on enterprise software startups. I am an executive-in-residence (EIR) at Balderton Capital and principal of my own eponymous consulting business.
I bring an uncommon perspective to startup challenges having 10 years’ experience at each of the CEO, CMO, and independent director levels across 10+ companies ranging in size from zero to over $1B in revenues.
From 2012 to 2018, I was CEO of cloud EPM vendor Host Analytics, where we quintupled ARR while halving customer acquisition costs in a competitive market, ultimately selling the company in a private equity transaction.
Previously, I was SVP/GM of the $500M Service Cloud business at Salesforce; CEO of NoSQL database provider MarkLogic, which we grew from zero to $80M over 6 years; and CMO at Business Objects for nearly a decade as we grew from $30M to over $1B in revenues. I started my career in technical and product marketing positions at Ingres and Versant.
I love disruption, startups, and Silicon Valley and have had the pleasure of working in varied capacities with companies including Bluecore, Cyral, FloQast, Gainsight, MongoDB, Recorded Future, and Tableau.
I previously sat on the boards of Granular (agtech, acquired by DuPont), Aster Data (big data, acquired by Teradata), Nuxeo (content services, acquired by Hyland), and Profisee (MDM, exited to Pamlico).
I periodically speak to strategy and entrepreneurship classes at the Haas School of Business (UC Berkeley) and Hautes Études Commerciales de Paris (HEC).