Dissecting the Big Data Philosophy Behind Netflix Through Its "Simple" Cover Design

Hadoop  Author: bank0901  Date: 2014-03-15 10:50:16

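Netflix has been rated the website with the highest customer satisfaction five times in a row. Valuing customers and using data analysis to understand user habits are deeply ingrained in its corporate culture, and its advanced data visualization technology makes large, complex data easy to understand, analyze, and process. Netflix has developed a data philosophy of its own: even a choice as small as the colors of a TV series cover draws on its vast trove of data. From senior executives to rank-and-file staff, the company takes data seriously to a degree that puts countless others to shame. The author, Phil Simon, a technology expert writing for WIRED, provides a detailed analysis.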

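At a data-driven company like Netflix, data visualization plays a key role, and it has to. Data visualization can be defined in two ways. Broadly, it is the practice of presenting data visually, usually with some interactivity. Narrowly, it is the process of abstracting data, extracting the valuable information, and presenting it in schematic form. Finally, contemporary data visualization technologies are able to incorporate what is now called Big Data.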

Taking Data Visualization Seriously

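Netflix's corporate blog makes clear that the company takes data visualization very seriously. Many of Netflix's major systems contain data visualization components, and, like other Visual Organizations, Netflix uses data visualization tools as a matter of habit. Its employees routinely turn to these tools to tweak algorithms, gain new insights, and solve pressing business problems.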

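Jeff Magnusson is the manager of data platform architecture at the company. On June 27, 2013, at the Hadoop Summit, he offered a rare look at Netflix's approach to Big Data, which left a deep impression on me. Magnusson presented with Charles Smith, a colleague and software engineer. The talk had an intriguing title: "Watching Pigs Fly with the Netflix Hadoop Toolkit." In their presentation, Magnusson and Smith laid out three key tenets of the Netflix data philosophy: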

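  • Data should be accessible, easy to discover, and easy to process for everyone.
  • Whether a dataset is large or small, being able to visualize it makes it easier to explain.
  • The longer it takes to find the data, the less valuable the data becomes.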

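At the core of Netflix's business are some of the most advanced Big Data tools, including data visualization applications. These tools serve two key groups: customers and technical professionals. Importantly, serving both groups ultimately benefits everyone, whether executives, shareholders, nontechnical employees, or others.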

Let the Data Decide

Compare the covers of House of Cards and the 2010 version of Macbeth.

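At first glance, they are strikingly similar. Both show an older white man with blood on his hands (Kevin Spacey and Patrick Stewart, respectively) set in sharp contrast against a black background. Figure 3.1 gives a detailed color breakdown: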

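Figure 3.1 shows the obvious: the covers of the two shows have a great deal in common. Subtle differences also exist, and Netflix can quantify them precisely. More importantly, Netflix can see whether those differences have any noticeable effect on subscribers' viewing habits, recommendations, ratings, and the like.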

Figure 3.2 shows a color analysis of House of Cards, Arrested Development, and Hemlock Grove (an American horror thriller that premiered on April 19, 2013).

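Given the high cost of quality original series (House of Cards reportedly cost as much as 78 million US dollars to produce), would Netflix pick a cover carelessly? Would decision-makers neglect to mine the company's trove of data? Subscribers already face countless choices; would Netflix leave something this important to chance? The answer is no. Netflix did not invite outsiders to the production meetings for Hemlock Grove and House of Cards, but the company has the data to make the best-informed decision, and I would bet that its executives carefully reviewed subscriber data when choosing the covers for these series.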

Analyze and Understand Customers to Win Them

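At Netflix, comparing the hues of similar images is not a one-off experiment run by a bored employee; it has become a regular part of choosing a cover. Netflix recognizes that the results of these experiments have enormous potential value, and it has built tools specifically to unlock that value. At the Hadoop Summit, Magnusson and Smith explained that data analysis helps a great deal in choosing titles, colors, and covers. Analyzing colors lets the company measure the distance between customers and even detect changes in a customer's mood.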

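How many organizations understand their customers this well? Very few, I would guess. Most companies would like to understand their customers, but would do well to know even half as much as Netflix does.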

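This raises the obvious question: why analyze customer data? Through Big Data and visualization, analyzing customer data lets Netflix seamlessly deliver remarkably personalized service to each customer. At the same time, Netflix can easily aggregate data about its customers, including genres, viewing habits, trends, and more. With this data, Netflix can try to answer questions that most organizations cannot answer or would never think to ask. For colors and covers, these include: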

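  • Are certain customers drawn to particular kinds of covers? If so, should personalized recommendations change accordingly?
  • Which title colors appeal to which customers?
  • Is there an ideal cover for an original series, or should different colors be used for different audiences?
  • And plenty more.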

Making Data Analysis Part of the Corporate Culture

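In short, data analysis lets Netflix answer many questions and make better business decisions based on high-quality data and visualization tools. Most importantly, it has made valuing data and data visualization part of the corporate culture.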

English original:

In a data-driven environment like Netflix, data visualization plays a key role. It must. In The Visual Organization, I offer the following definition of data visualization. Dataviz signifies the practice of representing data through visual and often interactive means. An individual dataviz represents information after it has been abstracted in some schematic form. Finally, contemporary data visualization technologies are capable of incorporating what we now call Big Data.

According to its corporate blog, Netflix considers data visualization to be of paramount importance. Many of Netflix’s major systems contain significant dataviz components. And, like other Visual Organizations covered in this section, Netflix uses data-visualization tools on a continuous basis, not occasionally. That is, Netflix employees routinely look to existing dataviz tools to tweak algorithms, garner new insights, and solve pressing business issues.

Jeff Magnusson serves as the manager of data platform architecture at the company. On June 27, 2013, at the Hadoop Summit, he provided a rare window into the Netflix Big Data ethos. Magnusson presented with Charles Smith, a colleague and a software engineer. The title of the talk: “Watching Pigs Fly with the Netflix Hadoop Toolkit.” During their presentation, Magnusson and Smith laid out three key tenets of the Netflix data philosophy:

  • Data should be accessible, easy to discover, and easy to process for everyone.
  • Whether your dataset is large or small, being able to visualize it makes it easier to explain.
  • The longer you take to find the data, the less valuable it becomes.

These canons explain why Netflix is the quintessential Visual Organization. At the heart of its business lie some of the most sophisticated Big Data tools on the planet, including no shortage of dataviz applications. At a high level, these tools serve the interests of two critical constituencies: customers and technical professionals. It’s important to note, however, that satisfying both masters ultimately benefits everyone: executives, stockholders, nontechnical employees, and others.

Customer Insights

Look at the covers of House of Cards and the 2010 version of Macbeth that ran on the PBS series Great Performances.

At first glance, they are eerily similar. They both display older white men with blood on their hands—Kevin Spacey and Patrick Stewart, respectively—against primarily black backgrounds. Figure 3.1 illustrates the detailed color breakdown:

Figure 3.1 manifests the obvious: the covers of the two shows are much more similar than dissimilar. At the same time, though, subtle differences exist—and Netflix can precisely quantify those differences. What’s more, Netflix can see if they have any discernible impact on subscriber viewing habits, recommendations, ratings, and the like.

Figure 3.2 shows a similar color analysis of House of Cards, Arrested Development, and Hemlock Grove, an American horror thriller and Netflix original program that premiered on April 19, 2013.

Given the cost of producing high-quality original content, why would Netflix create the cover for a new series in a vacuum? Why wouldn’t decision-makers look at the company’s vast trove of data? With subscribers bombarded by nearly unlimited options, why leave such a potentially critical aspect completely to chance? After all, Netflix possesses the data to make the most informed business decision possible. No, Netflix didn’t invite outsiders to production meetings for Hemlock Grove and House of Cards. Still, you can bet that its head honchos carefully reviewed subscriber data when selecting the covers to these series.

At Netflix, comparing the hues of similar pictures isn’t a one-time experiment conducted by an employee with far too much time on his hands. It’s a regular occurrence. Netflix recognizes that there is tremendous potential value in these discoveries. To that end, the company has created the tools to unlock that value. At the Hadoop Summit, Magnusson and Smith talked about how data on titles, colors, and covers helps Netflix in many ways. For one, analyzing colors allows the company to measure the distance between customers. It can also determine, in Smith’s words, the “average color of titles for each customer in a 216-degree vector over the last N days.”
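The talk does not spell out how such a vector is built, so what follows is only a rough sketch under stated assumptions, not Netflix's actual method. It assumes that "216-degree" means a 216-dimensional color histogram obtained by splitting each RGB channel into 6 levels (6 × 6 × 6 = 216), that a customer's "average color" is the mean of the histograms of the covers watched in the last N days, and that the "distance between customers" is the Euclidean distance between those averages. The function names color_vector, average_color, and distance are hypothetical.

```python
from datetime import date, timedelta
from math import sqrt

LEVELS = 6  # assumption: 6 levels per RGB channel, giving 6 * 6 * 6 = 216 bins


def color_vector(pixels):
    """Return a normalized 216-bin color histogram for (r, g, b) pixels in 0-255."""
    counts = [0] * LEVELS ** 3
    total = 0
    for r, g, b in pixels:
        # Map each 0-255 channel value to one of the 6 levels.
        ri, gi, bi = (c * LEVELS // 256 for c in (r, g, b))
        counts[(ri * LEVELS + gi) * LEVELS + bi] += 1
        total += 1
    if total == 0:
        raise ValueError("cover has no pixels")
    return [c / total for c in counts]


def average_color(views, today, n_days):
    """Average the cover vectors of titles viewed in the last n_days.

    views is a list of (view_date, vector) pairs; returns None when the
    customer watched nothing in the window.
    """
    cutoff = today - timedelta(days=n_days)
    recent = [vec for day, vec in views if cutoff < day <= today]
    if not recent:
        return None
    return [sum(column) / len(recent) for column in zip(*recent)]


def distance(a, b):
    """Euclidean distance between two color vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up example data: a mostly black cover with some red, and a white one.
dark_cover = color_vector([(0, 0, 0)] * 3 + [(255, 0, 0)])
light_cover = color_vector([(255, 255, 255)])

today = date(2014, 3, 15)
alice = average_color([(date(2014, 3, 14), dark_cover)], today, n_days=7)
bob = average_color([(date(2014, 3, 13), light_cover)], today, n_days=7)
print(distance(alice, bob))
```

In practice the pixels would come from decoded cover images, and a perceptual color space might compare hues better than raw RGB bins; both choices are left open here.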

In a word, wow.

How many organizations understand their customers to this extent? I would hazard to guess that few do. Most companies would love to know even half as much about their customers as Netflix does.

This begs the obvious question, how? Through Big Data and dataviz, Netflix seamlessly delivers mind-boggling personalization to each customer. At the same time, Netflix can easily aggregate data about customers, genres, viewing habits, trends, and just about anything else. Equipped with this data, Netflix can attempt to answer questions that most organizations can’t or won’t even ask. With respect to color and covers, these include the following:

  • Are certain customers trending toward specific types of covers? If so, should personalized recommendations automatically change?
  • Which title colors appeal to which customers?
  • Is there an ideal cover for an original series? Or should different colors be used for different audiences?
  • And plenty more.

Simon Says: Think Visually

In short, Visual Organizations like Netflix can ask better questions and make better business decisions based upon superior data, dataviz tools, and a culture that recognizes the importance of both.

Source: ITPUB Blog, link: http://blog.itpub.net/22567304/viewspace-1119821/. If you repost this article, please credit the source; otherwise legal action may be taken.