Today's New York Times article

Web Content by and for the Masses

[@more@]


Published: June 29, 2005

SAN FRANCISCO, June 28 – When Caterina Fake arrives at
the end of a plane flight, she snaps a photo of the baggage carousel
with her camera phone to assure her mother, who views the photo on a
Web page minutes later, that she has traveled safely.


Peter DaSilva for The New York Times

Sharing will define the next phase of the Web, said Jeff Weiner of
Yahoo, pictured at rear with David Ku, front, and Eckart Walther.

And if every picture tells a
story, that may be only the start. At Flickr, the popular Web
photo-sharing service where Ms. Fake, a co-founder, posted the photo,
it can be tagged with geographic coordinates for use in a photographic
map, or become part of a communal database of images that can be
searched for certain colors or characteristics.

Flickr, acquired this year by Yahoo,
is just one example of a rapidly growing array of Web services all
seeking to exploit the Internet’s power to bring people together.

From
photo- and calendar-sharing services to "citizen journalist" sites and
annotated satellite images, the Internet is morphing yet again. A
remarkable array of software systems makes it simple to share anything
instantly, and sometimes enhance it along the way.

Inexpensive to
create and worldwide in reach, the new Internet services are having an
impact far beyond the file sharing at issue in the Supreme Court’s
decision on Monday, which focused on copyright violations using
peer-to-peer software.

Indeed, the abundance of user-generated
content – which includes online games, desktop video and citizen
journalism sites – is reshaping the debate over file sharing. Many
Internet industry executives think it poses a new kind of threat to
Hollywood, the recording industry and other purveyors of proprietary
content: not piracy of their work, but a compelling alternative.

The
new services offer a bottom-up creative process that is shifting the
flow of information away from a one-way broadcast or publishing model,
giving rise to a wave of new business ventures and touching off a
scramble by media and technology companies to respond.

"Sharing
will be everywhere," said Jeff Weiner, a Yahoo senior vice president in
charge of the company’s search services. "It’s the next chapter of the World Wide Web."

In its race to catch up with the search-engine leader Google,
Yahoo is turning to just such a shared resource: the wisdom of friends
and business associates. On Tuesday, Yahoo introduced My Web 2.0, a new
version of the company’s search engine that will harness the collective
power of small groups of Web surfers to improve the quality of search
results.

The service, which the company’s executives refer to as
a "social search engine," is based on a new page-ranking technology
that Yahoo has named MyRank. Rather than relying on which pages are
linked to most frequently on the Web – the so-called Page Rank
technology pioneered by Google – MyRank organizes pages based on how
closely search users are related to one another in their social network
and on their reputation for turning up helpful information.

My
Web 2.0 allows Web pages found useful by one member of a group to be
instantly accessible to a network of trusted associates and to their
network contacts as well. The service, Yahoo executives hope, will
combat the growing problem of search-engine manipulation by using a
collection of human eyes and minds to sort the wheat from the chaff.

Yahoo is not alone in looking for ways to take advantage of digital content created at the grass roots. This month, Microsoft
said it would add a content-subscription feature known as R.S.S., or
Really Simple Syndication, to its software in an effort to take
advantage of the explosion of user-created material. Apple Computer began offering a similar feature in the newest version of its Macintosh operating systems earlier this year.

"We are now entering the participation age," Jonathan I. Schwartz, the president and chief operating officer of Sun Microsystems,
said on Monday at an industry conference in San Francisco. "The really
interesting thing about the network today is that individuals are
starting to participate. The endpoints are starting to inform the
center."

And the announcements keep coming. On Tuesday, Google
said it would make available a free version of its Google Earth
software program that permits users to view high-resolution digital
imagery of the entire planet. A feature of the service will be the
ability of user communities to annotate digital images to make them
more useful.

Other early examples include a user-created map of
London overlaid on a schematic of the city’s subway system, and a link
between Google Maps and the apartment rental and real estate listings
of Craigslist, making it easy to visualize where rentals are in
neighborhoods or entire cities.

"It’s beyond what is possible
with individual effort, but once it’s there, millions of people will
have a tremendous impact," said John Hanke, the general manager of
Google’s satellite imaging group. "We have built this common ground
that other people can leverage."

Many Internet developers think that the Internet’s new phase will
shift power away from old-line media and software companies while
rapidly bringing about an age of computerized "augmentation" by
blending the skills of tens of thousands of individuals.

"The giant brain is us," said
Peter Hirshberg, a former Apple Computer executive who recently joined
Technorati, a service based in San Francisco that indexes more than 11
million Web logs. His reference is to the 1960’s fear that computers
would emerge as omniscient artificial intelligences that would control
society. Instead, he said, the Internet is now making it possible to
exploit the collective intellectual power of Internet users efficiently and
instantly.

While Hollywood studios have generally scoffed at
competition from amateurs, the most striking example of user-generated
content may come from Spore, an online game being developed by Will
Wright, the developer of the Sims series of video games.

Spore,
scheduled for release next year, will incorporate a variety of software
tools that let users "evolve" a civilization. Rather than a massively
multiplayer game, the current fashion in online role playing, it will
be a "massively single player" game.

Although they will all be
connected by the Internet, game players will not interact with one
another, but rather with the civilizations that other players have
evolved. The entertainment value will be in exploring civilizations
created by other players and interacting with characters controlled by
artificial-intelligence software.

Spore is intended to appeal to
young game players who have no interest in being entertained passively.
"We have a whole generation of kids who feel entitled to be game
designers," Mr. Wright said.

To be sure, such open collaborative
projects can fall victim to antisocial behavior. Last week, for
example, obscene postings prompted The Los Angeles Times to curtail an
experiment in collective editorial writing using a software system
called a Wiki, an Internet server program that permits users to
collaborate in the creation of Web pages.

But the Yahoo My Web
designers think they have found a way around that hazard with a system
in which individuals invite their friends and business colleagues to
join them – an approach that will create overlapping search communities
based on mutual trust.

The Yahoo My Web software makes it
possible for users to categorize or "tag" Web pages they have found, as
well as annotate them. Tagging makes it possible for groups of
independently acting computer users to create improvised classification
systems.

The My Yahoo system makes it possible to use tags to
find categories of information as well as experts on particular
subjects. The system has a feature making it possible to see whether an
associate who has found and saved a document is online and available to
be contacted through Yahoo’s instant-messaging system.

Yahoo is
organizing the collections of tags on a central server, and they create
what is being called a "folksonomy," to distinguish the classification
system from a traditional taxonomy.

Similar tagging systems are
being used by Web services like Flickr, the photo-sharing service
purchased by Yahoo; Technorati, the Web log search engine; and
del.icio.us, a service for categorizing Web pages. But Yahoo is the
first major company to adopt the approach to harness group knowledge.

Technorati’s
founder, David L. Sifry, said the company had picked up 18 million
tagged postings and more than 1.4 million unique tag names since
January. He said a new set of standards would extend tagging into areas
like reviews, calendar events and profiles of individuals.

The
development of the tagging system typifies the bubbling up of Internet
creativity. "There is a lot of innovation coming from the fringe," said
Tim O’Reilly, the chief executive of O’Reilly Media, a publishing
company based in Sebastopol, Calif.

Mr. O’Reilly, a pioneer of
the commercial Internet in the 1990’s, said he believed that new
business models would soon emerge to match the technologies. "Certain
types of proprietary content are being displaced by freely sharable
content," he said. "Yet ultimately, this is a more complex situation,
too. New ways of monetizing content are emerging." And Google, notably,
has shown the business potential in software that harnesses online
material.

For Ms. Fake of Flickr, however, the business model is still secondary. "We’re creating a culture of generosity," she said.

contradictions and failures

It has started to get hot, Shanghai. Nothing like the temperature in Beijing. But hot, still. At night, when breezes come, heat turns into relief. But night brings the extra burden of useless thoughts and hopes and desires and crashes. Everything cancels out. Isn’t it nice.

Pointless mumbling.

[@more@]
con

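Yesterday I was standing in line for a coffee at the Starbucks in Xujiahui when someone behind me called my name. I turned around and, of all people, it was Abei from Douban.

Quite a coincidence in a city as big as Shanghai. Then again, we do live fairly close to each other.

Today Abei came by the office for a while. From now on I have one more partner for playing ball and the like.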

[@more@]

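Web 2.0 and Toodou

(This week's Sanlian Life Week has an article about Toodou. What I'm posting below was written before that article came out. Shang Jin said: write something explaining what Toodou is really about. So I did. I'm posting it here while I'm at it, because I think a lot of people really don't know what Toodou wants to do in the future.

Between the ideal and the reality lies the gap I have to pass through every day.)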

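When we started building Toodou a little over half a year ago, we had never even heard of Web 2.0. But that's how many things go: when enough similar ideas turn up independently in the same period, they naturally get labeled a "movement" or a "wave." Being singled out early as a typical representative of a wave is, of course, a good thing.

Two things were clear to us about Toodou from the very beginning. One: we live in the age of the individual. Two: this is a visual age.

Examples of the age of the individual are everywhere; almost any news story will supply one. Take the Reuters business news of May 30: Nike had just launched a Web site that lets buyers design the style, colors and so on of their own Nike shoes. For a company like Nike, whose business model rests entirely on fashion and design, launching this amounts to admitting that a consumer's personalized design may not match the professional standard of any one of Nike's several hundred professional designers, but the consumer likes it. Not only that, they are willing to pay a few dozen dollars extra for a pair of shoes of their own.

I think the age of the individual can be attributed to the fact that we live in a time of social stability and extreme material abundance (certain regions excepted; sadly, people no longer look past them so much as simply don't look at all). There is some terrorism, and there are tsunamis and the like, but compared with any earlier time these are not the main thread of society.

So Toodou was designed around the "individual" from the start: programs people make themselves, shared and enjoyed among themselves.

The other point is the visual age. I think everyone has seen this: the pursuit of the visual has seeped into every corner of society. Words still have power, of course, but we have reached a time when even social revolutions need to be expressed in colors. Look at the recent Orange Revolution, the Purple Revolution, and so on.

These two are only broad concepts. What we concretely want to build is our Toodou. A Web site has one advantage: what we have done is displayed right there on it. And what we plan to do mostly follows the same thread, so it is easy to trace.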

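First, our users must benefit. For the podcasters who supply programs, for example, the benefits can be convenience in publishing and the various promotional tools available to them. Beyond that, any commercial revenue Toodou may earn in the future will come as a small share taken after the individual program provider has received his or her own income. For people who are purely viewers, the benefit is the experience of using the site; every day we scrutinize each step a viewer might take and each impression they might have.

Second, the foundation of the site must be an algorithm computed by machine. An analogy with the society we live in, mentioned above, helps here. It is because our society has a relatively complete system and a set of operating rules that keep it stable and prosperous that individuals can pursue individuality on top of it. Imagine: in the Congo, what child would tell everyone he meets that he wants individuality, wants to design his own iPod, Nike shoes and personalized ringtone, and wants to realize his own worth? What he thinks about every day is how to get enough to eat, how to avoid catching AIDS, and how to survive the next famine.

So Toodou's design is a complete, stable foundation of machine algorithms hidden beneath the personalization.

Third, a user interface that makes people say "wow" the moment they see it.

The machine algorithm is like the steel and concrete of a building. You know the structure is sound and stable, and that is reassuring. But nobody wants to stare at a steel-and-concrete frame every day. What people can see, and want to see, is the exterior. In every Frank Gehry design the underlying structure has to be worked out carefully before it can support a breakthrough in form. But what we say "wow" at is the form.

People already think Toodou's user interface is quite good. The project we are working on now, to be finished in a few days, will make everyone say "Wow" on sight. As for the underlying algorithm, people don't need to understand it too well.

As for usage, I hold in my head every link on Toodou and every image that appears after each click of the mouse. We have scrutinized every step a user takes, and we have to keep improving.

The user interface is also constantly evolving, for a simple reason: Toodou is a living organism. Every day, on top of the algorithm we have set, the site evolves. It is a magical process, perhaps like what parents feel watching their child grow day by day.

These three points are the foundation on which Toodou is built.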

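Toodou's business model is actually very simple, and it was thought through carefully months ago, back in the planning stage. I don't want to go into detail now, because we are still following the plan step by step.

Put simply, though, I believe that among today's environments for producing content, the Internet is by any measure the most creative and the most vigorous. So the source of programs and content that we care about is not the programs that are tiresome at first glance whether they are shown in widescreen or high definition, on LCD or plasma; it is the Internet, and the individuals on it who are the most creative and vigorous. For many users, the programs on Toodou and those tiresome television programs that nevertheless still have an audience complement each other. And the outlet for Toodou's programs is the Internet itself as well as traditional channels beyond the Internet: a many-to-many, personalized form.

Technically, this model requires Toodou to be as open as possible. Moreover, all of Toodou's underlying layers are built on open-source components. When you drink the water, remember the source.

So far, people generally agree with the logic and the plan behind Toodou, but they also all ask the same question: what about policy risk?

In terms of technology and process, controlling what gets published is easy. We are applying for the necessary licenses and permits. Those are operational matters, and they are easy. But everyone also knows that in China the government can easily crush any company that thinks itself strong.

Still, I have great confidence in China, my own country.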

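A while ago Thomas Friedman, the well-known New York Times columnist who specializes in foreign affairs, published a book, "The World Is Flat: A Brief History of the Twenty-first Century."

The book's thesis is that because of technology, especially the spread of broadband and of countless undersea fiber-optic cables in the Internet age, we live in an interconnected world of instant communication. For the people on this network the earth is not round but flat. In version 3.0 of globalization, the process will be led not by companies but by individuals. Everyone connected to the network, whether in the United States, Europe, China or India, has a nearly equal chance to compete on it, and a chance to win.

His observations and conjectures hit very close to home for Toodou. Toodou is, so far, the best and the largest application of this kind in this flat world, whether compared with the United States or with Europe. Here in Shanghai, China, we know of quite a few companies and individuals around us whose ideas and work are competitive anywhere in the world and might well win.

Here is an example many people are not aware of. Although their reputation is now poor, the commercial models for text messaging and mobile ringtones became widespread in Asia, especially in China, before spreading to Europe and the United States. One of the more amusing cases: a colleague at my previous company, an American, had a best friend from his McKinsey days who quit to start a ringtone download site in the United States, copying the Asian model wholesale. Everyone said it could not succeed. They probably assumed that business models and advanced technology always flow out from the United States and never the other way around.

A few months ago his company was acquired by Verizon for 40 million dollars.

I am not a futurist, nor am I Thomas Friedman, and I don't know what the future holds. But if Web 2.0 has really arrived, I believe killer applications may emerge in many places in this flat world. And in China, where broadband is so widespread and the Internet platform is so vast and unified, without Europe's language problem, the companies that emerge and win will surely be competitive worldwide.

Thomas Friedman greatly admires the capability of the Chinese government, saying its senior officials are the most capable he has met anywhere in the world. He gives many examples: twenty years of sustained economic growth, the planning and guidance of markets, and so on.

A smart government will surely encourage and guide the world-class companies of the future that may emerge on Chinese soil, rather than strike them down.

Whatever the future holds, this enormous game is sure to be exciting and fun. And billions of people in every corner of the world all have a chance to take part in it: the Great Game of our time.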
 [@more@]

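Blinked Myself to Sleep

Malcolm Gladwell has a new book out, Blink, which argues that a person's decision in the first two seconds is often more effective than a decision reached after careful analysis.

Michael brought me the book when he came over yesterday. His reason was simple: last year I gave him Gladwell's breakout book, The Tipping Point, and he found it good and thought-provoking. He saw this one at the airport and bought it immediately. That act more or less confirms one point up front: a person often makes decisions in two seconds. It also confirms, however, that decisions made in two seconds are often wrong.

The book says people often make snap judgments and these judgments are often right. OK. These judgments are sometimes wrong. OK. The more experience a person has, the more accurate the snap judgments become. OK.

All true, but everyone knows these things already; most people just lack Gladwell's gift for writing, which lets him line up quite a few interesting stories to illustrate those points.

So that is about all the book amounts to: a few interesting stories. As for the argument itself, well, I'm not sure what his argument actually is.

The Tipping Point was good because he combined ideas from statistics and psychology and applied them to familiar events around us. The thing he set out to explain, the so-called tipping point, is mathematically entirely sound, and the stories he chose explain that mathematics aptly.

Blink, or "the blink of an eye" if the title were translated into Chinese: reading it made me so drowsy that I closed my eyes and, on a muggy Shanghai afternoon with the plum rains approaching, took a rare afternoon nap.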

[@more@]

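Ajax and Flash, Both at Once

It's funny how things go. A while back I trashed Ajax, mainly because I thought it was pure hype: cobble a few technologies together and suddenly it's a platform.

Yesterday Charles said, "Um, this Ajax we're using..." Before he finished I knew we were done for. It has now infected our own Charles. Giving something a name people remember the moment they hear it really does pay off.

The Ajax he meant is our program page. JavaScript plus one-way data: sure enough, that's the A and the J. Our system runs on Apache, and our open API is XML. All four letters of Ajax are accounted for.

So we're done for; so be it. I don't care what we use: as long as what gets built is what I want and the technology is what I want, call it whatever you like.

Nezha is Flash; the other side of Nezha, the overall program page, is Ajax. Toodou really does have everything. NND.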

[@more@]

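A Few Questions People Have Asked About Nezha

After one day of internal testing we have received some questions. I'll answer them here so I don't have to reply one by one.

Whom does Nezha serve?
Viewers and listeners. More precisely, a random viewer or listener: someone who is largely unfamiliar with the programs and the podcasters and who is, moreover, lazy. Like me.

For podcasters, Nezha's effect is indirect. All programs on Toodou are public. A podcaster who publishes a program usually enjoys winning the audience's approval. If a podcaster wants more viewers and wider recognition, Nezha's indirect effect is large. The podcaster only has to follow the principle of Nezha's algorithm, a fair machine rule, to earn a higher ranking and therefore more viewers.

Why does Nezha's algorithm use only "downloads" and ignore many other influential factors?
The more elegant an algorithm, the simpler it is. And a basic algorithm has to be simple to be practical. Many other factors have an influence but do not belong in the basic algorithm, and many of them are red herrings. Over time, adjustments will naturally be made on top of the basic algorithm. Given time and more programs, Nezha's effect will show.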

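So Nezha isn't RSS, and isn't the podcast enclosure?
The podcast enclosure is a single simple field; how could that amount to a project? Podcast functionality will come, of course, in a few more days. The only thing that currently keeps Toodou from being fully a podcast service is this enclosure field for downloading programs directly. And the only reason we have not provided it yet is that whatever Toodou does must be thought through carefully and, when done, done so that everyone says "Wow."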
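
For readers who have not met the enclosure field: in RSS 2.0 it is one element on a feed item, carrying a `url`, a `length` in bytes and a MIME `type`. Here is a minimal sketch in Python of what emitting it could look like. The function name, the title and the example.com URL are made up for illustration; none of this is Toodou's actual feed code.

```python
import xml.etree.ElementTree as ET


def build_item(title, media_url, size_bytes, mime_type):
    """Build one RSS <item> carrying a podcast-style <enclosure>."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    # RSS 2.0 enclosure: url, length (in bytes) and MIME type as attributes.
    ET.SubElement(
        item,
        "enclosure",
        {"url": media_url, "length": str(size_bytes), "type": mime_type},
    )
    return item


# Hypothetical program file of 30 MB hosted at a placeholder address.
item = build_item(
    "Example program",
    "http://example.com/media/program.mp3",
    30 * 1024 * 1024,
    "audio/mpeg",
)
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

A podcast client reads the `url` attribute and downloads the file directly, which is the direct-download step the answer above says Toodou has not yet switched on.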

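Open source?
Toodou's Nezha is based entirely on XML. Some time after the public beta we will open the API. Our program upload process is the same kind of XML process, and its API will be opened at the same time.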

[@more@]

The NY Times: Which camera does this Pro use?

"Ultimately, the technology is just a tool," he said. "It’s a tool that
lets your eye become the picture. It’s easy to get caught up with all
of the gadgets and all of the technology, but the most important thing
is just to get comfortable with the tools you have."

http://www.nytimes.com/2005/06/08/technology/circuits/08schiesel.html?pagewanted=1
[@more@]

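Nezha, Part 3: The Basic Algorithm

This afternoon we tossed the internal test version of Nezha out for people to try. With a little time on my hands, I'll continue writing up my Nezha algorithm. It's a very simple description. Those who get it will get it.

Continuing from the previous post:

Users on the Internet = Toodou's users
Keywords = tags
Web pages = programs on Toodou
Links to Web pages = resource cost spent by users = a machine-computable value on Toodou (?)
Relationships among groups of Web pages = ? on Toodou

The relationships among groups of Web pages are fairly simple:

Relationships among the tags themselves mean little. Between tags and users, a many-to-many relationship can be established through program files, which serve as the intermediate link.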
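
A rough sketch in Python of that many-to-many link, with the program file as the join between tags and users. The data layout and every name here (`program_tags`, `program_users`, the sample users) are assumptions for illustration only, and "users of a program" is taken to mean users who downloaded it.

```python
from collections import defaultdict


def link_tags_to_users(program_tags, program_users):
    """Join tags and users through the programs they share.

    program_tags:  {program_id: set of tags on that program}
    program_users: {program_id: set of users who downloaded it} (assumed meaning)
    Returns (tag -> users, user -> tags), both many-to-many.
    """
    tag_users = defaultdict(set)
    user_tags = defaultdict(set)
    for program, tags in program_tags.items():
        for user in program_users.get(program, ()):
            for tag in tags:
                tag_users[tag].add(user)
                user_tags[user].add(tag)
    return dict(tag_users), dict(user_tags)


# Made-up sample data.
program_tags = {"p1": {"music", "shanghai"}, "p2": {"music"}}
program_users = {"p1": {"alice"}, "p2": {"alice", "bob"}}

tag_users, user_tags = link_tags_to_users(program_tags, program_users)
print(tag_users)
print(user_tags)
```

No tag is linked to another tag directly; every relation passes through a program file, which matches the point that tag-to-tag relations carry little meaning on their own.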

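The more important question: on Toodou, what is the cost an ordinary user spends?

In an Internet world where traffic dominates nearly every discussion, what probably flashes through everyone's mind is page views.

On the surface this looks like the fairest, most direct, most reasonable way to count. But as with many appearances, the appearance is only an appearance.

Recall how Google evaluates Web pages and compare it with earlier methods. The early ones simply measured the traffic of a page or site. That seemed the most reasonable approach: whatever gets the most traffic must be the most valuable. Since Google we know that this is not actually the most reasonable approach. Traffic is too easy to manipulate, and too much of it is junk traffic that carries no meaning.

Everyone knows the mindless mob is very powerful. But the direction the mob chooses has little value in the long run. Why else call it a mindless mob, or a manipulated mob?

Back to the earlier conclusion: a user's vote is meaningful only if the user has spent a measurable resource.

On Toodou, the resource a user must spend for the vote to be meaningful is bandwidth. What we count is therefore:
1. How many users have downloaded a given program file.
2. How many programs each user downloads per day.

The basis for this analysis is that Toodou's program files are fairly large, often tens of megabytes. If a user, after previewing a program online, is still willing to spend the time and bandwidth to download the file to their own hard disk, that vote counts for far more than a page view.

At the same time, a user who downloads only one program a day assigns more value to that program than a user who downloads dozens of programs a day assigns to each of theirs.

That lets us express our algorithm in a simple formula (for the curious: this formula, together with points 1 and 2 above, is what Toodou has filed for a worldwide patent on, along with the practical application of Toodou's Nezha described later):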

The method of ranking any individual file with multiple accessing devices is thus:

VR(A) = (VD(d1)/C(d1) + … + VD(dn)/C(dn))

Where:
VR(A) is the value of file A.
VD(di) is the value that each unique accessing device di assigns to the file.
C(di) is the number of assignments that accessing device di conducts over a set period of time.
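
A small worked sketch of the formula in Python. Two things are assumed because the post does not spell them out: each download carries a vote value VD of 1, and C(d) is the number of distinct files a device downloaded in the period (repeat downloads of the same file count once). The function name and the sample log are hypothetical.

```python
from collections import defaultdict


def rank_files(downloads, vote_value=1.0):
    """Compute VR for every file: the sum over devices of VD(d) / C(d).

    downloads: iterable of (device_id, file_id) pairs seen in one period.
    vote_value: assumed VD, the same for every device.
    """
    # C(d): distinct files each device downloaded in the period.
    files_per_device = defaultdict(set)
    for device, file_id in downloads:
        files_per_device[device].add(file_id)

    scores = defaultdict(float)
    for device, files in files_per_device.items():
        share = vote_value / len(files)  # VD(d) / C(d)
        for file_id in files:
            scores[file_id] += share
    return dict(scores)


# Made-up one-day log: u1 downloads one file, u2 four files, u3 two files.
log = [
    ("u1", "a"),
    ("u2", "a"), ("u2", "b"), ("u2", "c"), ("u2", "d"),
    ("u3", "b"), ("u3", "c"),
]
scores = rank_files(log)
for file_id, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(file_id, score)
```

In this log, files a, b and c are each downloaded by two devices, yet a scores 1.25 against 0.75 for b and c, because one of a's votes comes from a device that downloaded nothing else that day. A plain download count would tie all three.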

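On the surface it is a very simple algorithm.

Equally important, Toodou is a network that can be fully controlled from the inside, unlike Google, which has to seek order in the vast ocean of the Internet. Right now Toodou is still a small lake. But at the current rate of growth our programs and users will soon feel like the surging East China Sea. After that, the way we measure can easily be adjusted.

Nezha is preparation for that future East China Sea.

This afternoon we started testing our Nezha. It is still an internal test for now; those of you who got the internal test password, please send plenty of feedback.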

http://www.toodou.com/tag_alpha.php

[@more@]