<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Technically Possible</title>
	<atom:link href="http://technicallypossible.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://technicallypossible.wordpress.com</link>
	<description>My findings, Mis-Adventures, Jokes on Tech &#38; Advertising</description>
	<lastBuildDate>Sun, 18 Dec 2011 16:01:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='technicallypossible.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://0.gravatar.com/blavatar/cd00a5875e3b67631dce3c7a1c9f779c?s=96&#038;d=http%3A%2F%2Fs2.wp.com%2Fi%2Fbuttonw-com.png</url>
		<title>Technically Possible</title>
		<link>http://technicallypossible.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://technicallypossible.wordpress.com/osd.xml" title="Technically Possible" />
	<atom:link rel='hub' href='http://technicallypossible.wordpress.com/?pushpress=hub'/>
		<item>
		<title>What I&#8217;m Reading &#8211; 1</title>
		<link>http://technicallypossible.wordpress.com/2011/12/18/what-im-reading-1/</link>
		<comments>http://technicallypossible.wordpress.com/2011/12/18/what-im-reading-1/#comments</comments>
		<pubDate>Sun, 18 Dec 2011 16:01:15 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[Personal]]></category>
		<category><![CDATA[koyaanisqatsi]]></category>
		<category><![CDATA[keynotes]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=572</guid>
		<description><![CDATA[A post about how Yahoo! uses the behavioural data gathered from their toolbar to improve crawling. It&#8217;s pretty amazing how many previously uncrawled pages are identified using this method. http://glinden.blogspot.com/2011/11/browsing-behavior-for-web-crawling.html . A review of Koyaanisqatsi, which I watched with the GUPS. It&#8217;s a very artsy documentary, showing some amazing bits of photographic excellence. http://enthusiasms.org/post/13850656021 . [...]]]></description>
			<content:encoded><![CDATA[<ol>
<li>A post about how Yahoo! uses the behavioural data gathered from their toolbar to improve crawling. It&#8217;s pretty amazing how many previously uncrawled pages are identified using this method.<br />
<a href="http://glinden.blogspot.com/2011/11/browsing-behavior-for-web-crawling.html">http://glinden.blogspot.com/2011/11/browsing-behavior-for-web-crawling.html</a></li>
<li>A review of Koyaanisqatsi, which I watched with the <a href="http://guphotosoc.co.uk/">GUPS</a>. It&#8217;s a very artsy documentary, showing some amazing bits of photographic excellence.<br />
<a href="http://enthusiasms.org/post/13850656021" target="_blank">http://enthusiasms.org/post/13850656021</a></li>
<li>Watch HCIR2011 on YouTube: all the keynotes and presentations. A very handy resource for me, considering I have not yet been to a conference proper.<br />
<a href="http://thenoisychannel.com/2011/12/17/hcir-2011-now-on-youtube/" target="_blank">http://thenoisychannel.com/2011/12/17/hcir-2011-now-on-youtube/</a></li>
<li>We need some angry nerds. An editorial of sorts from Harvard.<br />
<a href="http://www.law.harvard.edu/news/2011/11/30_zittrain-the-personal-computer-is-dead.html" target="_blank">http://www.law.harvard.edu/news/2011/11/30_zittrain-the-personal-computer-is-dead.html</a></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2011/12/18/what-im-reading-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>
	</item>
		<item>
		<title>Virality and the Internet</title>
		<link>http://technicallypossible.wordpress.com/2011/11/28/virality-and-the-internet/</link>
		<comments>http://technicallypossible.wordpress.com/2011/11/28/virality-and-the-internet/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 13:31:32 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[Advertising]]></category>
		<category><![CDATA[Social]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=524</guid>
		<description><![CDATA[I was wondering the other day about who first used the word viral in the context of the internet and the spread of content. The Wikipedia page on Viral Marketing says the term was first coined by Tim Draper, a venture capitalist. This was in &#8217;96 and the social network in question was probably [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://technicallypossible.files.wordpress.com/2011/11/04-viral-marketing-150x150.png"><img class="alignright size-full wp-image-576" title="04-Viral-Marketing-150x150" src="http://technicallypossible.files.wordpress.com/2011/11/04-viral-marketing-150x150.png?w=590" alt=""   /></a>I was wondering the other day about who first used the word <em>viral</em> in the context of the internet and the spread of content. The <a href="http://en.wikipedia.org/wiki/Viral_marketing" target="_blank">Wikipedia page on Viral Marketing</a> says the term was first coined by <a href="http://en.wikipedia.org/wiki/Tim_Draper" target="_blank">Tim Draper</a>, a venture capitalist. This was in &#8217;96 and the social network in question was probably email. From those humble beginnings, virality has turned into a multi-million dollar industry where success stories are measured by <a href="http://www.youtube.com/watch?v=60og9gwKh1o" target="_blank">views</a>, <a href="http://www.addthis.com/blog/2011/07/20/viral-click-tracking-for-all/#.Ts-gD3LQutI" target="_blank">clicks</a> and <a href="http://blogs.ft.com/fttechhub/2011/11/facebook-four-degrees/#axzz1ej4a6DsJ" target="_blank">shares</a>.</p>
<p>So what makes internet content <em>go viral</em>? Is there a set of common factors that can ensure the success of your video or tweet? Is there a silver bullet? Over the last few years a number of companies, researchers and phony social media gurus have tried to explain the phenomenon. But it is still difficult to replicate the success of one campaign directly in another.</p>
<p>A <a href="http://cci.som.yale.edu/sites/cci.som.yale.edu/files/BergerVirality_of_Online_Content.pdf" target="_blank">paper by Berger &amp; Milkman</a> [pdf] looks at an interesting link between the emotion generated by the content and the degree of social transmission, or the <a href="http://en.wikipedia.org/wiki/K-factor_%28marketing%29" target="_blank">virality factor</a>. Articles which generated emotions like awe, anxiety or anger tended to be more viral than content which generated emotions such as sadness.</p>
<p>This brings me to why I am bothering with this right now. I have taken up an unpaid job with <a href="http://mindfulmum.co.uk" target="_blank">mindfulmum.co.uk</a> after I finished my <a href="http://technicallypossible.wordpress.com/2010/05/23/future-plans-information-retrieval/" target="_blank">MSc in Computing Science</a>. Since it is a startup, I get the chance to work in multiple areas &#8211; from advertising to web design and optimization. But my main job is to deal with the social outreach of the site.</p>
<p><img class="alignleft" src="http://www.principalspage.com/theblog/wp-content/uploads//2009/08/HARD-WORK.jpg" alt="" width="287" height="155" />So I have started with the basics. It&#8217;s only been a few days and we are looking to improve the site layout in such a way that it is easier to share content. Now we are looking into starting a campaign on Facebook with the goal of increasing the audience of the site&#8217;s content. There is a lot of hard work still to be done and a site redesign is still in the works. I&#8217;m now thinking of ideas to create a good viral application which can launch the social aspect of the site.</p>
<p>Well, it&#8217;s only been a few days.</p>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2011/11/28/virality-and-the-internet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>

		<media:content url="http://technicallypossible.files.wordpress.com/2011/11/04-viral-marketing-150x150.png" medium="image">
			<media:title type="html">04-Viral-Marketing-150x150</media:title>
		</media:content>

		<media:content url="http://www.principalspage.com/theblog/wp-content/uploads//2009/08/HARD-WORK.jpg" medium="image" />
	</item>
		<item>
		<title>The Effect of Google Instant</title>
		<link>http://technicallypossible.wordpress.com/2011/07/20/the-effect-of-google-instant/</link>
		<comments>http://technicallypossible.wordpress.com/2011/07/20/the-effect-of-google-instant/#comments</comments>
		<pubDate>Wed, 20 Jul 2011 17:06:30 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[Advertising]]></category>
		<category><![CDATA[IR]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=530</guid>
		<description><![CDATA[Quite a few of us have been using Google Instant for a while now. It is the natural progression of the &#8216;search suggestions&#8217;, after Google was convinced it could produce at least some useful predictions. Clearly this speeds up search and, except for a few irritating instances, it is a favourable development. On the other side [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://technicallypossible.files.wordpress.com/2011/04/screenshot.png"><img class="aligncenter size-full wp-image-561" title="Screenshot" src="http://technicallypossible.files.wordpress.com/2011/04/screenshot.png?w=590" alt=""   /></a></p>
<p>Quite a few of us have been using <a href="http://www.google.com/instant/">Google Instant</a> for a while now. It is the natural progression of the &#8216;search suggestions&#8217;, after Google was convinced it could produce at least some useful predictions. Clearly this speeds up search and, except for a few irritating instances, it is a favourable development.</p>
<p>On the other side of things are advertising and Search Engine Optimization (SEO). Every change that Google makes to its algorithms or UI is immediately dissected by the specialists and professionals in these two fields. So what is their take on Google Instant?</p>
<p>I believe <strong>Instant has made SEO even more important</strong>. That is, of course, considering SEO in a positive light: ensuring that a website actually appears where it is supposed to be found on Google. Since Instant reduces the time people spend searching, being in the top 5 results becomes crucial. Earlier, being on the first page of Google&#8217;s results was good enough, while now it looks likely that being on the first screen of Google is what is needed. The <em>click-through rates (CTR)</em> for sites in the top 3 would be far greater than for those below them. This increases the importance of good SEO for a website&#8217;s content.</p>
<p>How does this affect Google&#8217;s search advertising for each query? <em>Pay-per-Click</em> (PPC) advertising works on the principle of first identifying an impression of the ad. An impression is counted whenever a user:</p>
<ul>
<li>Presses ‘Enter’</li>
<li>Clicks on ‘Search’</li>
<li>Selects a prediction</li>
<li>Stays on page for &gt;3 seconds</li>
<li>Clicks on a result</li>
<li>Clicks on a refinement (maps, news, latest)</li>
</ul>
<p>The 3-second rule means that an ad may be displayed for 3 seconds, and so count as an impression, even though the query at that moment may not be the one being advertised against. Also, a number of keywords will fall behind in the CTR counts due to the changes.</p>
<p>All that can be said just now is that this is not a game-changer, though it will upset a number of the previously accepted designs and trends in SEO and PPC.</p>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2011/07/20/the-effect-of-google-instant/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>

		<media:content url="http://technicallypossible.files.wordpress.com/2011/04/screenshot.png" medium="image">
			<media:title type="html">Screenshot</media:title>
		</media:content>
	</item>
		<item>
		<title>Identifying Queries which Demand Recency Sensitive Results in Web Search</title>
		<link>http://technicallypossible.wordpress.com/2011/03/13/identifying-queries-which-demand-recency-sensitive-results-in-web-search/</link>
		<comments>http://technicallypossible.wordpress.com/2011/03/13/identifying-queries-which-demand-recency-sensitive-results-in-web-search/#comments</comments>
		<pubDate>Sun, 13 Mar 2011 23:16:19 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[IR]]></category>
		<category><![CDATA[Tech]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=528</guid>
		<description><![CDATA[Since joining the University of Glasgow for my Master&#8217;s I haven&#8217;t written a blog post. The main reason for this has been the amount of coursework which I have had to wade through in the last few months. As part of a Research Methods and Techniques course in Semester 1, we had to submit a literature [...]]]></description>
			<content:encoded><![CDATA[<p>Since joining the University of Glasgow for my Master&#8217;s I haven&#8217;t written a blog post. The main reason for this has been the amount of coursework which I have had to wade through in the last few months.</p>
<p>As part of a Research Methods and Techniques course in Semester 1, we had to submit a literature review on a topic of our choice. I chose the above topic since I was curious about how the freshness of web search results is maintained. I got interested in the topic after reading the papers by Dong <em>et al</em>. [3, 8]. I got a decent grade but, obviously, there were a huge number of errors. I thought it would be useful to blog it &#8211; just to give me a reason to correct some of the mistakes.</p>
<p><em><strong>WARNING</strong>: This is a lot of text for one blog post.</em></p>
<p><em>Title:</em> <span style="text-decoration:underline;"><strong>Identifying queries which demand recency sensitive results in web search</strong></span></p>
<p><strong>1. Introduction</strong></p>
<p>Recency ranking in web search deals with the incorporation of temporal aspects of documents into the ranking model. The aim in recency ranking is to provide results to the user which reflect the freshness of the documents without compromising on the relevance of the results. Recency, as an aspect of web search, started gaining importance with the advent of news-search. More recently, it is the demand for real-time search results which has been driving research into ranking algorithms which take freshness of content into account.</p>
<p>Dong <em>et al</em>. [3] state that there are two major challenges in providing recency-based ranking in web search:</p>
<ul>
<li><em>Firstly</em>, the search engine has to identify which queries require recency-sensitive ranking. This is important since recency-sensitive ranking algorithms tend to perform poorly when used for queries where the retrieved data is relatively static.</li>
<li><em>Secondly</em>, once the query has been identified, ranking recently crawled content in the absence of rich historical data, such as click-through rates and in-links, is a major challenge. This is because prestige-based ranking algorithms like PageRank [7] give a lot of weight to such features.</li>
</ul>
<p>Both challenges are closely related since the method used to identify queries which need recency sensitive treatment must drive the method used in the ranking. In this literature review, we will be looking at the solutions proposed to solve the first challenge, though we will touch upon some relevant relations between the two stated problems. Broadly, the solutions are from the fields of news search [1, 4], time-series analysis [2, 3, 5] and data mining [6].</p>
<p>In Section 2, we will first examine the problem in detail. Section 3 critically examines the methods used to solve the problem. Section 4 presents a conclusion which takes a holistic view of the problem and tries to identify a research gap.</p>
<p><strong>2. Problem Statement</strong></p>
<p>In traditional information retrieval, documents are considered as static entities which do not change with time. This implies that the relevance of a result set associated with a query is also seen as a constant. But new content is generated on the web very frequently in the form of blogs, news, tweets and other user-generated content. In web search, users therefore demand results which reflect the recency of the content while maintaining the relevance of the result set. This makes recency ranking indispensable.</p>
<p>One way of approaching recency ranking would be to introduce a feature, as part of the ranking algorithm, which gives a certain weight to the recency of a document irrespective of the query. But Dong <em>et al.</em> [3] have shown that recency algorithms perform poorly when applied to queries with a static information need. For example, “liver” is a static query. The result set remains fairly constant year-on-year. Applying recency ranking to this query will result in a long-established result losing out. Hence there is a need to identify recency-sensitive queries before ranking.</p>
<p>When a user enters a query they may or may not specify whether their intent is recency sensitive. For example, the intent behind a query such as “earthquake” could be interpreted in two ways. The user may be looking for geological information regarding earthquakes in general or they may be looking for information regarding a specific, recent earthquake. Since users do not clearly specify their information need in terms of recency, identification of such queries becomes a challenging problem.</p>
<p>It is not possible to identify the intent behind a user query by analysing the query on its own. Thus, the methods used to tackle this problem draw on other sources to perform recency classification of the query.</p>
<p><strong>3. Methods Used to Solve Problem</strong></p>
<p>The methods proposed to solve the above problem can be broadly divided into three classes. They are:</p>
<ol>
<li> <strong><em>News Vertical solutions</em></strong>: These are methods which focus only on the news vertical. The solutions proposed in [1, 4] perform well on news-related queries but have not been shown to work on more general queries.</li>
<li> <strong><em>Temporal Modelling</em></strong>: It is often difficult to capture the recency intent of a user from a single query. The methods put forward in [2, 3, 5] model either queries or time slots and look for discrepancies from the normal frequencies. Such discrepancies can be viewed as a reason to provide recency-sensitive results.</li>
<li> <strong><em>Composite model</em></strong>: Zhang <em>et al.</em> [6] propose a model which takes the time series, query log analysis, click-through data and other features and feeds them into a machine learning model to produce a classifier. Using all of these measures, the classifier can identify queries which require recency-sensitive ranking.</li>
</ol>
<p><strong>3.1 News Specific Solutions:</strong></p>
<p>There are two closely related approaches in this class. Diaz [1] and Konig <em>et al.</em> [4] both aim to predict whether a user will click on the dedicated news display, if it is shown, for a particular query. The approaches they take are different, and each has its advantages and disadvantages. Diaz [1] proposes a method which uses the fact that there exists a relation between click-through rates and recency-sensitive, news-related queries. First, the probability of a user clicking the separate news display, if it is indeed displayed for a particular query, is estimated. An initial guess at the probability is made from the context of the query. This is then represented as a Beta distribution such that it depends on both the initial guess and past clicks.</p>
<p>Query similarity is also considered and, once the probability is defined, a threshold value is set. If the probability of the query getting clicks is higher than the threshold, the news results are shown. This threshold value is set leniently so that more queries have an opportunity to get exposure. Most importantly, and unlike the approach taken by Konig <em>et al</em>. [4], click feedback is used to correct any errors made by the algorithm. It is shown experimentally that if news results are shown erroneously, they receive no clicks, and so the judgement is automatically reversed over time.</p>
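<p>To make the idea concrete for myself, here is a minimal sketch of a Beta-distribution click estimate with a display threshold. It is not the formulation from Diaz [1]; the function names, the <code>prior_strength</code> parameter and the default threshold of 0.1 are all assumptions of mine, chosen only for illustration.</p>

```python
def click_probability(prior_guess, prior_strength, clicks, views):
    """Posterior mean of a Beta distribution over the click probability.

    prior_guess is the initial estimate made from the query context and
    prior_strength says how many pseudo-observations that guess is worth
    (both are illustrative assumptions, not values from the paper).
    """
    alpha = prior_guess * prior_strength + clicks
    beta = (1.0 - prior_guess) * prior_strength + (views - clicks)
    return alpha / (alpha + beta)


def should_show_news(prior_guess, clicks, views, threshold=0.1, prior_strength=10.0):
    """Show the news display only while the estimate stays above the threshold."""
    estimate = click_probability(prior_guess, prior_strength, clicks, views)
    return estimate > threshold
```

<p>With no feedback the estimate is simply the initial guess. As views without clicks accumulate, the estimate falls, and once it drops below the threshold the display is withdrawn, which is the kind of self-correction through click feedback described above.</p>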
<p>It was seen that using click feedback improves on the baseline taken in the experiments. The leniency introduced in the threshold value did not improve the overall performance. This was due to poor classification of queries which were under the threshold value.</p>
<p>In the method proposed by Konig <em>et al</em>. [4], first a set of features is considered from the corpora, which are defined to include Wikipedia, blogs and news. After feature extraction (for example tf-idf, cosine similarity and Jensen-Shannon divergence), the learning model is applied. The learning model used is Multiple Additive Regression Trees, or MART, which is built on Stochastic Gradient Boosting. It is claimed that MART has better accuracy and can handle the rich feature set provided to it. Randomness is added to the algorithm to improve robustness. The algorithm is then applied to the training data and, after making sure that there is no overlap between the training data and the test collection, the testing is done. Quantitative CTR prediction for future queries is observed to be very accurate. The paper shows PR curves for the threshold CTR values. Overall prediction success = 82%.</p>
<p>The major differences between the methods proposed by Konig <em>et al.</em> [4] and Diaz [1] lie in the use of temporal features and in the machine-learning algorithms used. Both methods are biased towards news-related queries and so fail to answer questions relating to the coverage of other kinds of queries.</p>
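<p>As a reminder of what one of these features measures, here is a small sketch of the Jensen-Shannon divergence between two discrete distributions, for instance the word distributions of two corpora. How exactly Konig <em>et al</em>. [4] apply it is not covered here; the function name and the list-of-probabilities input format are my own assumptions.</p>

```python
import math


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q are equal-length lists of probabilities that each sum to 1.
    """
    def kl(a, b):
        # Kullback-Leibler divergence; terms with zero probability contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Mixture distribution halfway between p and q.
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return (kl(p, m) + kl(q, m)) / 2
```

<p>The value is 0 for identical distributions and 1 for distributions with no overlap at all, so it works as a bounded measure of how different two word distributions are.</p>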
<p><strong>3.2 Temporal Modelling</strong></p>
<p>This class of methods approaches the problem by modelling the temporal aspects of either the queries or the results of the queries to identify which ones are recency-sensitive.</p>
<p>The methods put forward by Vlachos <em>et al</em>. [5] and Dong <em>et al.</em> [3] are opposite in their approach. Vlachos <em>et al.</em> [5] propose a method that follows the topic frequency over different time slots, detecting any discrepancies in the frequency, or “bursts” of activity. In the method proposed by Dong <em>et al.</em> [3], by contrast, the time slots themselves are modelled and compared with each other.</p>
<p>Vlachos <em>et al.</em> [5] introduce methods to identify important features in the time series and also to identify bursts in activity. The paper aims to match queries with temporal bursts in activity. The aim of their compression algorithm is to minimize the Euclidean distance between the query and the compressed time-series representation. The best Fourier coefficients are used for the compressed representation because they describe the series better. Three algorithms are put forward to improve the similarity search of the query against the compressed time series – BestMin, BestError and BestMinError.</p>
<p>A cumulative distribution function of the time series is used, and the high query frequency during important periods will be significantly off from the mean. Thus the periods with maximum query activity are identified. Finally, to handle bursts in query activity which are out of the ordinary, the bursts are first identified and then compressed to enable comparison later. The storage can be done in any relational database. For the detection phase, moving averages are used over the time series. If a power value is far higher than the moving average, it is deemed to be a burst. Next, to find all queries with similar burst patterns, the burst features are compressed and then compared. Similarity and overlap between the bursts are checked. Thus it becomes possible to identify a set of queries which show a spike in activity.</p>
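<p>The detection step can be sketched roughly as follows. This is only an illustration of comparing each value against a trailing moving average; the window length and the multiplying factor are assumptions of mine and not the cut-off used by Vlachos <em>et al</em>. [5].</p>

```python
def detect_bursts(counts, window=7, factor=2.0):
    """Return the indices where a count is far above the trailing moving average.

    counts is a list of query frequencies, one per time slot. A slot is
    flagged when its count exceeds factor times the average of the
    preceding window slots (both parameters are illustrative).
    """
    bursts = []
    for i in range(window, len(counts)):
        average = sum(counts[i - window:i]) / window
        if counts[i] > factor * average:
            bursts.append(i)
    return bursts
```

<p>For a query that is issued about ten times a day and then fifty times on one day, only that one day is flagged; the days that follow are compared against an average that now includes the spike.</p>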
<p>Dong <em>et al</em>. [3] proposed a method which aims to classify a query as recency sensitive. Time slots, as opposed to topics or queries, were modelled. N-gram models were built for the content and the queries in each time slot and compared with the models of past time slots. Any discrepancy in the content and query models would then show up as a recency trend. A final &#8216;buzz&#8217; score is computed for every query. If this value exceeds a certain threshold, the query is considered recency sensitive. To determine the ideal constants of the function, some manual classification was done. The reported experiments show that the query classifier has an accuracy approaching 90% at times when there is high breaking-news traffic. At other times the classifier doesn&#8217;t work as well.</p>
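<p>A toy version of the time-slot comparison might look like the sketch below. It uses unigram models with add-one smoothing and scores a query by how much more likely it is under the current slot than under a past slot. This is a simplification I made up to illustrate the idea; it is not the scoring function or the constants from Dong <em>et al</em>. [3], and all names are hypothetical.</p>

```python
import math
from collections import Counter


def unigram_model(texts):
    """Count word occurrences over all texts seen in one time slot."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def log_likelihood(query, model, vocabulary_size):
    """Log probability of the query under an add-one smoothed unigram model."""
    total = sum(model.values())
    return sum(
        math.log((model[word] + 1) / (total + vocabulary_size))
        for word in query.lower().split()
    )


def buzz_score(query, current_texts, past_texts):
    """Positive when the query is more likely in the current slot than in the past one."""
    current = unigram_model(current_texts)
    past = unigram_model(past_texts)
    # One extra vocabulary slot is reserved for words unseen in both time slots.
    vocabulary_size = len(set(current) | set(past)) + 1
    return (log_likelihood(query, current, vocabulary_size)
            - log_likelihood(query, past, vocabulary_size))
```

<p>A query whose words suddenly appear more often in the current slot gets a positive score, and comparing that score against a threshold gives a crude recency classifier in the spirit of the buzz score.</p>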
<p>Finally, as part of the recency ranking framework, four separate recency features were considered and compared -Timestamp, Linktime, WebBuzz and Page classification. The ranking used machine learning methods using recency training data as well as regular relevance training data. GBrank was the machine-learning algorithm used.  Three ranking models were also tested &#8211; Composition Model, Overweighting Model and the Adaptation Model.</p>
<p>Another method which defines a query temporal model is proposed by Diaz &amp; Jones [2]. This paper looks at the temporal nature of query results with the aim of predicting the relevance of the result set. A temporal profile is made for the query based on timestamps of the top 5 documents retrieved. The  granularity is defined at 1 day. Smoothing of the model is required since some days may have an awkward spike in query traffic while other days may have no traffic at all. This concept of smoothing has been discussed in time series analysis. Once the model is prepared, features are extracted to predict the relevance of the result set. This method of modelling queries is both simple and elegant from a theoretical point of view. From the context of this paper, it was effective as well.</p>
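<p>A temporal profile of this kind can be sketched as follows. The dates, the helper names and the centred moving average used for smoothing are assumptions made for illustration; the smoothing method of the paper itself is not reproduced here.</p>

```python
from collections import Counter
from datetime import date, timedelta


def temporal_profile(doc_dates, start, end):
    """Fraction of retrieved documents per day between start and end (inclusive)."""
    counts = Counter(doc_dates)
    days = (end - start).days + 1
    total = len(doc_dates) or 1  # avoid dividing by zero for an empty result set
    return [counts[start + timedelta(days=d)] / total for d in range(days)]


def smooth(profile, window=3):
    """Centred moving average so a single spiky or empty day does not dominate."""
    half = window // 2
    out = []
    for i in range(len(profile)):
        neighbours = profile[max(0, i - half):i + half + 1]
        out.append(sum(neighbours) / len(neighbours))
    return out


# Timestamps of the top 5 documents for a hypothetical query, at 1-day granularity.
top_docs = [date(2010, 8, 1), date(2010, 8, 1), date(2010, 8, 3),
            date(2010, 8, 5), date(2010, 8, 5)]
profile = temporal_profile(top_docs, date(2010, 8, 1), date(2010, 8, 5))
print(profile)
print(smooth(profile))
```

<p>Features such as how peaked or how flat the smoothed profile is could then be fed to the relevance predictor.</p>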
<p><strong>3.3 Composite Modelling</strong></p>
<p>While the approaches above each apply particular models and concepts to the problem of identifying recency-sensitive queries, none of them combines all the relevant metrics to see which ones perform best.</p>
<p>The paper by Zhang <em>et al.</em> [6] applies a broad set of metrics to a specific problem. A number of queries occur at regular time intervals but require recency-sensitive handling. Take the query “SIGIR”: when it is issued in 2010, the results should focus on SIGIR 2010 and not on past conferences. These queries are called <em>Recurrent Event Queries</em>, or REQ. The time at which these queries occur is predictable, and they account for 6% of all queries issued, so they are worth dealing with.</p>
<p>The paper presents an REQ classifier which uses a wide range of features, all combined by machine learning. It defines three different machine learning algorithms and compares them to find the most effective.</p>
<p>The paper extracts a number of features which it then feeds into a machine learning framework. It first classifies queries into those with explicit time, implicit time, and no time markings. Query log analysis is used to obtain metrics such as whether a query has an implicit or explicit temporal nature, the frequency of that query, the number of different years in which the query has been issued, and a chi-square distance. Metrics on changes made to queries by users during a session are also included. Click log analysis is used to get the click-through rates of the queries. The result sets of implicit time queries are checked for year features. A list of REQ words, tokenized from implicit queries, is also built.</p>
<p>Three different learning algorithms are used to combine the above metrics – Naive Bayes, SVM [9] and Gradient Boosted Decision Tree (GBDT). To evaluate the effectiveness of the learning algorithms, 6000 queries were manually labelled as REQ or not by human editors.</p>
<p>After running the algorithms, evaluation was done using a precision-recall (PR) graph. Precision was defined as (correctly classified REQ)/(total classified as REQ) and recall as (correctly classified REQ)/(all REQ). The experiments found that the Gradient Boosted Decision Tree was the algorithm which performed best. The most important feature was the ratio of the number of times a query is issued with a year qualifier to the number of times it is issued without one. Finally, a 3.6% gain in DCG at the top position was observed against regular existing web search.</p>
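<p>The precision and recall definitions translate directly into code. The sketch below uses invented query strings and classifier scores, and sweeps a score threshold to produce the points of a PR graph; it does not reproduce the data or the classifiers of the paper.</p>

```python
def precision_recall(predicted, actual):
    """predicted and actual are sets of queries labelled as REQ."""
    correct = len(predicted.intersection(actual))
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(actual) if actual else 0.0
    return precision, recall


def pr_curve(scores, actual, thresholds):
    """One (threshold, precision, recall) point per threshold, for a PR graph."""
    points = []
    for t in thresholds:
        predicted = {q for q, s in scores.items() if s >= t}
        points.append((t, *precision_recall(predicted, actual)))
    return points


# Hypothetical classifier scores and editor labels.
scores = {"sigir": 0.9, "olympics": 0.8, "python": 0.6, "weather": 0.2}
editor_labelled_req = {"sigir", "olympics", "world cup"}
for point in pr_curve(scores, editor_labelled_req, [0.5, 0.85]):
    print(point)
```

<p>Raising the threshold trades recall for precision, which is the curve the three algorithms are compared on.</p>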
<p><strong>4. Future Scope and Research Gap</strong></p>
<p>There is still work to be done in the field of recency query identification. Twitter is a source of real-time news, and it still needs to be investigated whether mined Twitter data can be used effectively to identify recency-sensitive queries. Data from bookmarking sites like Digg or Del.icio.us could also serve as an indicator of the recency sensitivity of a related query.</p>
<p><strong>5. Bibliography</strong></p>
<p>[1] Fernando Diaz. 2009. <strong>Integration of news content into web results.</strong> In Proceedings of the Second ACM International Conference on Web Search and Data Mining (WSDM &#8217;09), Ricardo Baeza-Yates, Paolo Boldi, Berthier Ribeiro-Neto, and B. Barla Cambazoglu (Eds.). ACM, 182-191. <a href="http://doi.acm.org/10.1145/1498759.1498825">http://doi.acm.org/10.1145/1498759.1498825</a></p>
<p>[2] 	Fernando Diaz and Rosie Jones. 2004. <strong>Using temporal profiles of queries for precision prediction</strong>. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR &#8217;04). ACM, USA, 18-24.  <a href="http://doi.acm.org/10.1145/1008992.1008998">http://doi.acm.org/10.1145/1008992.1008998</a></p>
<p>[3] Anlei Dong, Yi Chang, Zhaohui Zheng, Gilad Mishne, Jing Bai, Ruiqiang Zhang, Karolina Buchner, Ciya Liao, and Fernando Diaz. 2010. <strong>Towards recency ranking in web search.</strong> In Proceedings of the third ACM international conference on Web search and data mining (WSDM &#8217;10). ACM, USA, 11-20. <a href="http://doi.acm.org/10.1145/1718487.1718490">http://doi.acm.org/10.1145/1718487.1718490</a></p>
<p>[4] Arnd Christian Konig, Michael Gamon, and Qiang Wu. 2009. <strong>Click-through prediction for news queries.</strong> In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval (SIGIR &#8217;09). ACM, USA, 347-354. <a href="http://doi.acm.org/10.1145/1571941.1572002">http://doi.acm.org/10.1145/1571941.1572002</a></p>
<p>[5] Michail Vlachos, Christopher Meek, Zografoula Vagena, and Dimitrios Gunopulos. 2004. <strong>Identifying similarities, periodicities and bursts for online search queries</strong>. In Proceedings of the 2004 ACM SIGMOD international conference on Management of data (SIGMOD &#8217;04). ACM, USA, 131-142. <a href="http://doi.acm.org/10.1145/1007568.1007586">http://doi.acm.org/10.1145/1007568.1007586</a></p>
<p>[6] Ruiqiang Zhang, Yuki Konda, Anlei Dong, Pranam Kolari, Yi Chang, and Zhaohui Zheng. 2010. <strong>Learning recurrent event queries for web search.</strong> In Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (EMNLP &#8217;10). Association for Computational Linguistics, Morristown, NJ, USA, 1129-1139. <a href="http://portal.acm.org/citation.cfm?id=1870658.1870768">http://portal.acm.org/citation.cfm?id=1870658.1870768</a></p>
<p>[7] PageRank</p>
<p>[8] Twitter for recency ranking dong et al.</p>
<p>[9] Corinna Cortes and V. Vapnik, &#8220;Support-Vector Networks&#8221;, Machine Learning, 20, 1995. <a rel="nofollow" href="http://www.springerlink.com/content/k238jx04hm87j80g/">http://www.springerlink.com/content/k238jx04hm87j80g/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2011/03/13/identifying-queries-which-demand-recency-sensitive-results-in-web-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>
	</item>
		<item>
		<title>Why GNU grep is fast</title>
		<link>http://technicallypossible.wordpress.com/2010/08/30/why-gnu-grep-is-fast/</link>
		<comments>http://technicallypossible.wordpress.com/2010/08/30/why-gnu-grep-is-fast/#comments</comments>
		<pubDate>Mon, 30 Aug 2010 20:20:22 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[gnu]]></category>
		<category><![CDATA[grep]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=471</guid>
		<description><![CDATA[I thought I&#8217;d blog this. This is a message from Mike Haertel (the chap who wrote GNU grep) on the freebsd-current mailing list explaining why GNU grep is fast. Hi Gabor, I am the original author of GNU grep. I am also a FreeBSD user, although I live on -stable (and older) and rarely pay attention to [...]]]></description>
			<content:encoded><![CDATA[<p>I thought I&#8217;d blog this. This is a <a href="http://lists.freebsd.org/pipermail/freebsd-current/2010-August/019310.html">message from Mike Haertel</a> (the chap who wrote GNU grep) on the freebsd-current mailing list explaining why GNU grep is fast.</p>
<blockquote>
<pre>Hi Gabor,

I am the original author of GNU grep.  I am also a FreeBSD user,
although I live on -stable (and older) and rarely pay attention
to -current.

However, while searching the -current mailing list for an unrelated
reason, I stumbled across some flamage regarding BSD grep vs GNU grep
performance.  You may have noticed that discussion too...

Anyway, just FYI, here's a quick summary of where GNU grep gets
its speed.  Hopefully you can carry these ideas over to BSD grep.

#1 trick: GNU grep is fast because it AVOIDS LOOKING AT
EVERY INPUT BYTE.

#2 trick: GNU grep is fast because it EXECUTES VERY FEW
INSTRUCTIONS FOR EACH BYTE that it *does* look at.

GNU grep uses the well-known Boyer-Moore algorithm, which looks
first for the final letter of the target string, and uses a lookup
table to tell it how far ahead it can skip in the input whenever
it finds a non-matching character.

GNU grep also unrolls the inner loop of Boyer-Moore, and sets up
the Boyer-Moore delta table entries in such a way that it doesn't
need to do the loop exit test at every unrolled step.  The result
of this is that, in the limit, GNU grep averages fewer than 3 x86
instructions executed for each input byte it actually looks at
(and it skips many bytes entirely).

See "Fast String Searching", by Andrew Hume and Daniel Sunday,
in the November 1991 issue of Software Practice &amp; Experience, for
a good discussion of Boyer-Moore implementation tricks.  It's
available as a free PDF online.

Once you have fast search, you'll find you also need fast input.

GNU grep uses raw Unix input system calls and avoids copying data
after reading it.

Moreover, GNU grep AVOIDS BREAKING THE INPUT INTO LINES.  Looking
for newlines would slow grep down by a factor of several times,
because to find the newlines it would have to look at every byte!

So instead of using line-oriented input, GNU grep reads raw data into
a large buffer, searches the buffer using Boyer-Moore, and only when
it finds a match does it go and look for the bounding newlines.
(Certain command line options like -n disable this optimization.)

Finally, when I was last the maintainer of GNU grep (15+ years ago...),
GNU grep also tried very hard to set things up so that the *kernel*
could ALSO avoid handling every byte of the input, by using mmap()
instead of read() for file input.  At the time, using read() caused
most Unix versions to do extra copying.  Since GNU grep passed out
of my hands, it appears that use of mmap became non-default, but you
can still get it via --mmap.  And at least in cases where the data
is already file system buffer caches, mmap is still faster:

  $ time sh -c 'find . -type f -print | xargs grep -l 123456789abcdef'
  real	0m1.530s
  user	0m0.230s
  sys	0m1.357s
  $ time sh -c 'find . -type f -print | xargs grep --mmap -l 123456789abcdef'
  real	0m1.201s
  user	0m0.330s
  sys	0m0.929s

[workload was a 648 megabyte MH mail folder containing 41000 messages]
So even nowadays, using --mmap can be worth a &gt;20% speedup.

Summary:

- Use Boyer-Moore (and unroll its inner loop a few times).

- Roll your own unbuffered input using raw system calls.  Avoid copying
  the input bytes before searching them.  (Do, however, use buffered
  *output*.  The normal grep scenario is that the amount of output is
  small compared to the amount of input, so the overhead of output
  buffer copying is small, while savings due to avoiding many small
  unbuffered writes can be large.)

- Don't look for newlines in the input until after you've found a match.

- Try to set things up (page-aligned buffers, page-sized read chunks,
  optionally use mmap) so the kernel can ALSO avoid copying the bytes.

The key to making programs fast is to make them do practically nothing. ;-)

Regards,

	Mike</pre>
</blockquote>
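<p>To make the skipping idea concrete, here is a small Python sketch. It uses the Horspool simplification of Boyer-Moore (a single skip table keyed on the byte under the pattern&#8217;s last position), not GNU grep&#8217;s actual unrolled C loop, and the function names are my own. It also shows the trick of only hunting for newlines once a match has been found.</p>

```python
def build_skip_table(pattern):
    """Shift distance for each byte that may sit under the pattern's last position."""
    m = len(pattern)
    table = {}
    for i in range(m - 1):
        table[pattern[i]] = m - 1 - i
    return table


def search(buffer, pattern):
    """Return the index of the first match of pattern in buffer, or -1."""
    m, n = len(pattern), len(buffer)
    if m == 0:
        return 0
    skip = build_skip_table(pattern)
    pos = m - 1  # index in buffer aligned with the pattern's final byte
    while n > pos:
        last = buffer[pos]
        # Compare the final byte first; only on a hit check the whole window.
        if last == pattern[-1] and buffer[pos - m + 1:pos + 1] == pattern:
            return pos - m + 1
        # Bytes that never occur in the pattern let us jump a full pattern length.
        pos += skip.get(last, m)
    return -1


def matching_line(buffer, pattern):
    """Locate the bounding newlines only after a match has been found."""
    hit = search(buffer, pattern)
    if hit == -1:
        return None
    start = buffer.rfind(b"\n", 0, hit) + 1
    end = buffer.find(b"\n", hit)
    if end == -1:
        end = len(buffer)
    return buffer[start:end]


data = b"first line\nsecond line with needle here\nthird\n"
print(search(data, b"needle"))
print(matching_line(data, b"needle"))
```

<p>Python won&#8217;t show the speed benefit, of course. The point is only that most input bytes are never inspected, and that the buffer is never split into lines up front.</p>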
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2010/08/30/why-gnu-grep-is-fast/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>
	</item>
		<item>
		<title>Who is Ryan Aguillard?!</title>
		<link>http://technicallypossible.wordpress.com/2010/08/28/who-is-ryan-aguillard/</link>
		<comments>http://technicallypossible.wordpress.com/2010/08/28/who-is-ryan-aguillard/#comments</comments>
		<pubDate>Sat, 28 Aug 2010 12:02:01 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[JFF]]></category>
		<category><![CDATA[Social]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[ryan aguillard]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=497</guid>
		<description><![CDATA[Try searching on Google for facebook. The results are all as they should be &#8211; with the right sites at the very top. All is well. Until you look closely. Check the sub-links within the first result. What is that link below Login doing there?! Ryan Aguillard. It links to the facebook internationalization page (http://www.facebook.com/index.php/Internationalization [...]]]></description>
			<content:encoded><![CDATA[<p>Try searching on Google for <strong>facebook</strong>.<br />
The results are all as they should be &#8211; with the right sites at the very top.<br />
All is well. Until you look closely.</p>
<p><a href="http://technicallypossible.files.wordpress.com/2010/08/screenshot.png"><img class="aligncenter size-full wp-image-509" title="Ryan Aguillard?" src="http://technicallypossible.files.wordpress.com/2010/08/screenshot.png?w=590" alt=""   /></a></p>
<p>Check the sub-links within the first result. What is that link below<em> Login</em> doing there?!</p>
<p><strong>Ryan Aguillard.</strong><br />
It links to the <em>facebook internationalization</em> page (http://www.facebook.com/index.php/Internationalization), which redirects to my home page.</p>
<p>What is that doing there, Google?</p>
<p><strong>Who is Ryan Aguillard? </strong></p>
<p>A quick Google &amp; facebook search yields no results. All the links are generic. The only links which seem promising are a forum post &#8211; with no replies by the way &#8211; asking the same question. Also, a link to a Huffington Post profile which looks unused.</p>
<p><strong><em>Update 1: </em></strong></p>
<p><em>Recently (Aug 20, 2010), Google <a href="http://googlewebmastercentral.blogspot.com/2010/08/showing-more-results-from-domain.html">announced</a> that they will be showing an increased number of results from each domain if the search is focused enough on a single domain.</em></p>
<p><em>They said, &#8220;Today we’ve launched a change to our ranking algorithm that will make it  much easier for users to find a large number of results from a single  site.  For queries that indicate a strong user interest in a particular  domain, we’ll now show more results from the relevant site. We’re always reassessing our ranking and user interface, making hundreds  of changes each year.  We expect today’s improvement will help users  find deeper results from a single site, while still providing diversity  on the results page.&#8221;</em> <em></em></p>
<p><em>The aberration that is Ryan Aguillard is probably a result of this new algorithm doing something stupid.</em></p>
<p><strong><em>Update 2:</em></strong></p>
<p><em>I received a huge spike in traffic yesterday on Aug 31st 2010.<br />
It also seems the problem has been set straight on Sept 1st 2010.<br />
I guess that&#8217;s the end of that.<br />
</em></p>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2010/08/28/who-is-ryan-aguillard/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>

		<media:content url="http://technicallypossible.files.wordpress.com/2010/08/screenshot.png" medium="image">
			<media:title type="html">Ryan Aguillard?</media:title>
		</media:content>
	</item>
		<item>
		<title>SixthSense and Acceptable Gestures</title>
		<link>http://technicallypossible.wordpress.com/2010/08/27/sixthsense-and-acceptable-gestures/</link>
		<comments>http://technicallypossible.wordpress.com/2010/08/27/sixthsense-and-acceptable-gestures/#comments</comments>
		<pubDate>Fri, 27 Aug 2010 20:11:43 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[JFF]]></category>
		<category><![CDATA[HCI]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=468</guid>
		<description><![CDATA[I read a blog post recently by Nick Jones talking about some interesting research from University of Glasgow relating to Human Computer Interaction. The research looked at gestures which people would be willing to make in different situations and those which people would not. The research paper, titled Usable Gestures for Mobile Interfaces: Evaluating Social [...]]]></description>
			<content:encoded><![CDATA[<p>I read a <a href="http://blogs.gartner.com/nick_jones/2010/07/26/don%E2%80%99t-wave-at-me-like-that/">blog post</a> recently by <a href="http://www.gartner.com/AnalystBiography?authorId=6625">Nick Jones</a> talking about some interesting research from <a href="http://www.dcs.gla.ac.uk/gist/">University of Glasgow</a> relating to Human Computer Interaction. The research looked at gestures which people would be willing to make in different situations and those which people would not.</p>
<p>The research paper, titled <strong><em><a href="http://www.julierico.com/pap0238-rico.pdf">Usable Gestures for Mobile Interfaces: Evaluating Social Acceptability</a> </em></strong>by <a href="http://www.julierico.com/index.shtml">Julie Rico</a> and <a href="http://www.dcs.gla.ac.uk/~stephen/">Stephen Brewster</a>, looks at what gestures people will be willing to make using mobile devices in public places.</p>
<blockquote><p><em>The on-the-street user study demonstrated that user acceptance of gestures is increased after even one positive experience. <strong>The survey demonstrated the important role that observers play in social acceptability, with highly acceptable gestures including subtle imitations of everyday gestures and gestures with highly visible cues demonstrating their role as an interaction with an interface.</strong> These results provide researchers with concrete tools that can be used to assess the social acceptability of multimodal interaction techniques at an early stage of development.</em></p></blockquote>
<p><img class="alignright" title="Sixth Sense" src="http://technicallypossible.files.wordpress.com/2010/08/sixthsense01.jpg?w=226&#038;h=261" alt="" width="226" height="261" />Case in point now is the much hyped <a href="http://en.wikipedia.org/wiki/SixthSense">Sixth Sense</a> device designed by the Indian, <a href="http://www.pranavmistry.com/">Pranav Mistry</a>. In one of the examples in the demo, he shows the device projecting something onto another person. As Nick Jones points out, how many people would enjoy that? How long would you last before you get punched by the boyfriend of the girl whose T-Shirt you&#8217;re reading from?</p>
<p>The coolness of a product should not blind us to its real utility. It should also not blind us to the acceptability of what we do with it.</p>
<p>I hope mobile device designers read this and do not make people openly point at people or things, or make us do idiotic gestures.</p>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2010/08/27/sixthsense-and-acceptable-gestures/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>

		<media:content url="http://technicallypossible.files.wordpress.com/2010/08/sixthsense01.jpg?w=260" medium="image">
			<media:title type="html">Sixth Sense</media:title>
		</media:content>
	</item>
		<item>
		<title>Basics of Ubuntu Development &#8211; Part 1</title>
		<link>http://technicallypossible.wordpress.com/2010/08/27/basics-of-ubuntu-development-part-1/</link>
		<comments>http://technicallypossible.wordpress.com/2010/08/27/basics-of-ubuntu-development-part-1/#comments</comments>
		<pubDate>Fri, 27 Aug 2010 09:55:46 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[Open Source]]></category>
		<category><![CDATA[Tech]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=407</guid>
		<description><![CDATA[Ubuntu Developer Week was on some time back. I hung around for the start, where Daniel Holbach started things off with the &#8220;Getting Started with Ubuntu Development&#8221; IRC chat. Really good talk (chat) and a lot of questions answered too (though I was more of the passive listener tonight). The problem, I felt, was that [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://technicallypossible.files.wordpress.com/2010/07/ubuntu-logo11.jpg"><img class="alignright size-medium wp-image-423" title="ubuntu-logo1" src="http://technicallypossible.files.wordpress.com/2010/07/ubuntu-logo11.jpg?w=300&#038;h=274" alt="" width="300" height="274" /></a>Ubuntu Developer Week was on some time back. I hung around for the start, where <a href="http://daniel.holba.ch/blog/">Daniel Holbach</a> started things off with the &#8220;<strong><em>Getting Started with Ubuntu Development</em></strong>&#8221; IRC chat. Really good talk (chat) and a lot of questions answered too (though I was more of the passive listener tonight). The problem, I felt, was that an IRC chat is very real-time and difficult to use as a reference when you actually get down to working on something. So, I thought it would be a good idea to blog about it. I hope all the descriptions etc. are good and right.</p>
<p><em>If anyone wants to read the chat logs, you can find them <a href="https://wiki.ubuntu.com/MeetingLogs/devweek1007/GetStarted">here</a>.</em></p>
<p>So, here goes!</p>
<p>The first page which you can use is the Ubuntu wiki page &#8211; <a href="https://wiki.ubuntu.com/UbuntuDevelopment/UsingDevelopmentReleases">UsingDevelopmentReleases</a>. It has tonnes of info and links that you can use during your development time. The first thing you&#8217;ll need is an environment to develop in: a place where whatever crap you end up doing isn&#8217;t going to damage your happy Ubuntu installation. One of the things you can use is <code>chroot</code>.</p>
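<p><strong>Installing <code>chroot</code></strong></p>
<p>Easy thing, this. The <code>chroot</code> command itself ships with coreutils, so it is already on your system and there is no separate package to install. What you do need is <code>debootstrap</code>, which populates a directory with a minimal base system that you can then chroot into:<br />
<code>sudo apt-get install debootstrap</code></p>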
<p>If you haven&#8217;t already enabled &#8220;<em>Source Code</em>&#8221; and &#8220;<em>Universe</em>&#8221; in <em>System</em> → <em>Software Sources</em> → <em>Ubuntu Software </em>you can do that now.</p>
<p><em><strong>Note:</strong> We&#8217;ll be dealing with some basic bug-fixes etc. first. This is not going to be GNOME-centric (which is a natural assumption that many will have, considering it&#8217;s Ubuntu).</em></p>
<p>Run the following command :</p>
<p><code>sudo apt-get install --no-install-recommends bzr-builddeb ubuntu-dev-tools fakeroot build-essential gnupg pbuilder debhelper</code></p>
<p>This will give you a bunch of tools that are going to be useful, generally (not just in these examples)</p>
<p><code>bzr-builddeb</code> pulls in bzr, which is short for <a href="http://en.wikipedia.org/wiki/Bazaar_%28software%29">Bazaar</a>, which we&#8217;ll use to get the source code for one or two examples.<br />
<code>ubuntu-dev-tools</code> pulls in devscripts which are incredibly helpful at making repetitive packaging tasks easy<br />
<code>fakeroot</code> is needed by debuild (in devscripts) to mimic root privileges when installing files into a package<br />
<code>build-essential</code> pulls in lots of useful very basic build tools like gcc, make, etc<br />
<code>gnupg</code> is used to sign files in our case (uploads in the future)<br />
<code>pbuilder</code> is a build tool that builds source in a sane, clean and minimal environment it sets up itself<br />
<code>debhelper</code> contains scripts that automate lots of the build process in a package</p>
<p>Let&#8217;s first set up our <em>gpg key </em>which we use to sign files for upload. More generally it is used to sign and encrypt mails, files or general text. We use it to indicate that WE were the last to touch a file and not somebody else and that ensures that only people who we know about get to upload packages.</p>
<p>If you have no gpg-key yet, run the following command :</p>
<p><code>gpg --gen-key</code></p>
<p>When you are doing this, it is completely fine if you stick to the defaults. You don&#8217;t have to comment, for example. Enter your name, email address and just stick to the default values for now. It could be that gpg is still sitting there and waiting for  more random data to generate your key &#8211; that&#8217;s expected and fine. Just open another terminal while we carry on, it&#8217;ll finish on its own. As said earlier: if you have a gpg key already, skip this step</p>
<p><em>If you need more information on gpg keys, you can look through the <a href="https://help.ubuntu.com/community/GnuPrivacyGuardHowto">GNU Privacy Guard how-to</a>, which covers everything in more detail.</em></p>
<p>Now we can set-up <a href="http://en.wikipedia.org/wiki/Pbuilder#Isolated_build_environments">pbuilder</a>.</p>
<p>Open an editor and edit the file <code>~/.pbuilderrc</code> (create it if you don&#8217;t have it yet) and add the following line to it:</p>
<p><code>COMPONENTS="main universe multiverse restricted"</code></p>
<p>Save it and once you&#8217;re done, run the following command :</p>
<p><code>sudo pbuilder create</code></p>
<p><strong>What does pbuilder do?</strong></p>
<p>It builds packages in a clean and minimal environment.<br />
It keeps your system &#8220;clean&#8221; (so you don&#8217;t install millions of build dependencies on your own system).<br />
It makes sure the package builds in a minimal, unmodified environment, so you know the package does not build only because of changes you made to your own system; the build is reproducible.</p>
<p>You can update package lists (later on) with:</p>
<p><code>sudo pbuilder update</code></p>
<p>To build packages you run:</p>
<p><code>sudo pbuilder build package_version.dsc</code></p>
<p><em>How does pbuilder work?</em> It first gets the minimal packages for a base system and stores them in a tarball. Whenever you build a package it&#8217;ll untar the base tarball, then install whatever your current build requires, build it, then tear it all down again. Luckily it caches the packages.</p>
<p>Something else we can do in the meantime: if you use the bash shell, which is the default, please edit <code>~/.bashrc</code>. At the end of it, add something like<br />
<code>export DEBFULLNAME="Your Name"<br />
export DEBEMAIL="your.mail.id@provider.com"</code><br />
Once you&#8217;re done editing <code>~/.bashrc</code>, please run <code>source ~/.bashrc</code> (it&#8217;s only needed once).</p>
<p>QUESTION: Should these match the values we put into GPG?<br />
A: Yes</p>
<p>Ok, with this out of the way, the packaging tools will know you by your name and you don&#8217;t need to enter it, for example if you do changelog entries, etc.</p>
<blockquote><p><em>(aside)<strong> General Fundas:</strong></em></p>
<p><em>Ubuntu is very special in how it&#8217;s produced and how we all work. As you know it comes out every 6 months and that means we have a tight release schedule and everything we do and work on is defined by that schedule. Check <a href="https://wiki.ubuntu.com/MaverickReleaseSchedule">this</a> out for the current release schedule for maverick.</em></p>
<p><em>In that link, basically, green means: lots is allowed here, red means: almost nothing is allowed here.</em></p>
<p><em>In more detail,</em></p>
<p><em>- toolchain is uploaded for the new release (gcc, binutils, libc, etc.), so the most basic build tools are there<br />
- new changes that happened in the meantime are synced or merged (more on that later on)<br />
- ubuntu developer summit (uds) happens where features are defined and talked about<br />
- up until debian import freeze we import source changes from debian semi-automatically<br />
- up until feature freeze we get new stuff in, work on features, try to make it all work<br />
- if a feature is not half-way there yet by feature freeze, it will likely get deferred to the next release<br />
- from feature freeze on you can see that lots of freezes are added throughout the weeks and you&#8217;ll need more and more freeze exceptions for big changes</em></p>
<p><em>The focus is clearly: testing, testing, testing and fixing, fixing, fixing</em></p></blockquote>
<blockquote><p><em>So how do you get stuff in?<br />
Only people who &#8220;we know&#8221; get to upload packages directly, as we said before. This means that as a new contributor you will have to work with sponsors who basically review your work and upload it for you. Once you&#8217;ve done that a couple of times and they recognise you and your good work, you can apply for Ubuntu developer membership, and you can ask the people you&#8217;ve worked with for comments on your application.</em></p>
<p><em>It&#8217;s not very complicated: you basically set up a wiki page with your contributions, ask for comments and submit it for an IRC meeting of the Developer Membership Board, and you&#8217;re done. No need to learn a secret handshake, send me money or anything else. It&#8217;s contributions and good work that count.</em></p></blockquote>
<p>Now get the source of the hello package and build it with pbuilder:<br />
<code>apt-get source hello<br />
sudo pbuilder build hello_*.dsc</code></p>
<p>Now, pbuilder will work as explained earlier.<br />
<em><strong>Note</strong> : .dsc has some metadata like checksums and the like. Not important right now.</em></p>
<p>You can check out the contents of <code>/var/cache/pbuilder/result</code>, which will contain the built hello <code>.deb</code> package.</p>
<p><em><strong>End of Part 1</strong> &#8211; There is no logical end here, but I don&#8217;t like very (extra) long posts. </em></p>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2010/08/27/basics-of-ubuntu-development-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>

		<media:content url="http://technicallypossible.files.wordpress.com/2010/07/ubuntu-logo11.jpg?w=300" medium="image">
			<media:title type="html">ubuntu-logo1</media:title>
		</media:content>
	</item>
		<item>
		<title>What is so Random about a Random number?</title>
		<link>http://technicallypossible.wordpress.com/2010/08/26/what-is-so-random-about-a-random-number/</link>
		<comments>http://technicallypossible.wordpress.com/2010/08/26/what-is-so-random-about-a-random-number/#comments</comments>
		<pubDate>Thu, 26 Aug 2010 08:35:33 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[Theory]]></category>
		<category><![CDATA[number]]></category>
		<category><![CDATA[random]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=437</guid>
		<description><![CDATA[Some time back I asked a question on Quora: How does a computer choose a Random Number? There were a number of interesting responses, which I&#8217;d like to share here. The Question Since a computer can&#8217;t choose a number truly at random (and neither can humans, technically), what is a random number in the [...]]]></description>
			<content:encoded><![CDATA[<p><img class="aligncenter" title="Fairly Random" src="http://imgs.xkcd.com/comics/random_number.png" alt="" width="400" height="144" /></p>
<p>Some time back I asked a question on <a href="http://www.quora.com">Quora</a> : <em><strong><a href="http://www.quora.com/How-does-a-computer-choose-a-random-number">How does a computer choose a Random Number? </a></strong></em></p>
<p>There were a number of interesting responses, which I&#8217;d like to share here.</p>
<p><strong>The Question</strong></p>
<p>Since a computer can&#8217;t choose a number truly at random (and neither can humans, technically), what is a random number in the context of a computer? Since the computer calculates the number, is it only &#8220;relatively random&#8221;, i.e. the number &#8216;seems&#8217; random to the user when in fact it isn&#8217;t? I was wondering whether a user can successfully predict which &#8220;random&#8221; number is going to be generated by the computer.</p>
<p><strong>The Answers</strong></p>
<p>I got a few answers, the best and most explanatory of which was by <a href="http://www.quora.com/Kiat-Chuan-Tan"><em>Kiat Chuan Tan</em></a>.</p>
<blockquote><p><em>There are two main ways a computer can choose a &#8220;random&#8221; number: using a <strong>pseudo-random number generator (PRNG)</strong>, or using a <strong>hardware random number generator</strong>.</em></p>
<p><em>A <a href="http://en.wikipedia.org/wiki/Pseudorandom_number_generator"> pseudo-random number generator</a>, as the name suggests, isn&#8217;t truly  random. PRNGs typically use deterministic algorithms such as <a href="http://en.wikipedia.org/wiki/Lagged_Fibonacci_generator">lagged  Fibonacci generators</a> or the<a href="http://en.wikipedia.org/wiki/Mersenne_twister"> Mersenne twister</a>. These numbers generated by  these algorithms are completely determined by the start state, and will  eventually repeat. However, a good choice of parameters for the  algorithms will make the period sufficiently large for practical use  (2^19937 &#8211; 1 for the Mersenne twister).</em></p>
<p><em>A <a href="http://en.wikipedia.org/wiki/Hardware_random_number_generator">hardware random number  generator</a>, on the other hand, is theoretically supposed to be truly  random, in the sense that it is based on noise generated by physical  processes, e.g. sampling of ambient noise from a sound card.</em></p></blockquote>
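<p>To make the point about the start state concrete, here is a minimal Python sketch (my own illustration, not part of the Quora answer). Python&#8217;s <code>random</code> module uses the Mersenne Twister as its core generator, so two generators given the same seed produce the same sequence. The seed values and the helper name <code>first_rolls</code> are arbitrary choices for this example.</p>

```python
import random


def first_rolls(seed, count=5):
    """Return the first `count` die rolls from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent Mersenne Twister instance
    return [rng.randint(1, 6) for _ in range(count)]


# Same start state, same "random" numbers, every time.
run_a = first_rolls(2010)
run_b = first_rolls(2010)
print(run_a == run_b)  # prints True

# A different start state gives a different-looking sequence.
print(first_rolls(2011))
```

<p>So for a PRNG the answer to my question is yes: anyone who knows the start state can reproduce every number that follows.</p>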
<p><a href="http://www.quora.com/Michael-Hamburg">Michael Hamburg</a> added that:</p>
<blockquote><p><em>There are some generators that (unlike the Mersenne Twister) are  believed to be <a href="http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator">cryptographically pseudorandom</a>.  That is, if they start  with a small amount of randomness (say, 32 bytes), then there should be  no algorithm which, running on all the supercomputers in the world for a  thousand years, could notice a meaningful difference between the  &#8220;random&#8221; numbers they output and truly random numbers generated by a  hardware device.</em></p></blockquote>
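<p>In Python this split is visible in the standard library: <code>random</code> is the Mersenne Twister, while the <code>secrets</code> module draws from the operating system&#8217;s cryptographically secure source. A minimal sketch, where <code>new_token</code> is a hypothetical helper name and the 32 bytes simply echo the figure in the quote:</p>

```python
import secrets


def new_token(num_bytes=32):
    """Return a hex string built from `num_bytes` of OS-provided randomness."""
    return secrets.token_hex(num_bytes)  # two hex characters per byte


token = new_token()
print(len(token))                # 64 hex characters for 32 bytes
print(secrets.randbelow(6) + 1)  # a die roll between 1 and 6
```

<p>Unlike the seeded generator, there is no seed to pass in here, so there is nothing for a user to replay.</p>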
<p>Now, I was wondering if I could actually find a service which could, essentially, guarantee me randomness. A source for random numbers which was truly random and unpredictable.</p>
<p><strong>Random.org</strong></p>
<p><a href="http://www.random.org/">Random.org</a> is an interesting service operated by <a href="http://www.scss.tcd.ie/Mads.Haahr/">Mads Haahr</a> of the <a href="http://www.scss.tcd.ie/">School of Computer Science and Statistics</a> at <a href="http://www.tcd.ie/">Trinity College, Dublin</a> in Ireland.</p>
<p>So how do they generate their random numbers?</p>
<blockquote><p><em><img class="alignright" title="Random.org" src="http://userlogos.org/files/logos/Mafia_Penguin/Random.png" alt="" width="272" height="204" />RANDOM.ORG uses radio receivers to pick up atmospheric noise,   which is then used to generate random numbers.  The radios are tuned   between stations.  A possible attack on the generator is therefore   to broadcast on the frequencies that the RANDOM.ORG radios use in   order to affect the generator.  However, radio frequency attacks of   this type would be difficult for a variety of reasons.  First, the   frequencies that the radios use are not published, so an attacker   would have to broadcast across all frequencies of all bands used for   <a href="http://en.wikipedia.org/wiki/FM_broadcasting">FM</a><a href="http://en.wikipedia.org/wiki/AM_broadcasting">AM</a> broadcasting.  Second, this is not an attack that can be launched   from anywhere in the world, only reasonably close to the generator.   RANDOM.ORG currently has radio receivers in several different   countries, which would make it difficult to coordinate this type of   attack.  Third, if an attacker actually did succeed at broadcasting   highly regular signals (e.g., perfect sine waves) at exactly the   right frequencies from the right locations, then the RANDOM.ORG <a href="http://www.random.org/statistics/">real-time statistics</a> would pick up the drop   in quality very rapidly.  In particular, the <a href="http://www.random.org/statistics/source-purity/">Source Purity</a> and <a href="http://www.random.org/statistics/information-entropy/">Information Entropy</a> tests would start failing dramatically, which would raise an   alert.</em></p></blockquote>
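<p>To get a feel for why a regular signal would stand out, here is a rough Python sketch of a Shannon entropy estimate over a stream of bytes. This is my own illustration of the general idea and an assumption on my part, not RANDOM.ORG&#8217;s actual test; <code>shannon_entropy</code> and the sample data are made up for the example.</p>

```python
import math
import os
from collections import Counter


def shannon_entropy(data):
    """Estimate entropy in bits per byte from the byte frequencies in `data`."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum of p * log2(1 / p) over every byte value that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# Noise-like input: bytes from the operating system's random source.
noisy = os.urandom(65536)

# Regular input: a sine wave sampled 16 times per period, repeated.
period = bytes(int(127.5 + 127.5 * math.sin(2 * math.pi * k / 16)) for k in range(16))
regular = period * 4096

print(round(shannon_entropy(noisy), 2))    # close to the maximum of 8 bits per byte
print(round(shannon_entropy(regular), 2))  # at most 4 bits: only 16 samples per period
```

<p>The gap between the two figures is the kind of drop in quality that a monitoring check could alert on.</p>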
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2010/08/26/what-is-so-random-about-a-random-number/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>

		<media:content url="http://imgs.xkcd.com/comics/random_number.png" medium="image">
			<media:title type="html">Fairly Random</media:title>
		</media:content>

		<media:content url="http://userlogos.org/files/logos/Mafia_Penguin/Random.png" medium="image">
			<media:title type="html">Random.org</media:title>
		</media:content>
	</item>
		<item>
		<title>Desktop Search on Ubuntu</title>
		<link>http://technicallypossible.wordpress.com/2010/08/25/desktop-search-on-ubuntu/</link>
		<comments>http://technicallypossible.wordpress.com/2010/08/25/desktop-search-on-ubuntu/#comments</comments>
		<pubDate>Wed, 25 Aug 2010 20:14:36 +0000</pubDate>
		<dc:creator>Neel</dc:creator>
				<category><![CDATA[Tech]]></category>
		<category><![CDATA[desktop search]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://technicallypossible.wordpress.com/?p=450</guid>
		<description><![CDATA[I&#8217;m amazed that Ubuntu doesn&#8217;t have a good default Desktop Search tool. I&#8217;ve been using Ubuntu for a while now and the lack of a perfect desktop search is felt very strongly. The need for such a tool is greater in Linux than in Windows considering you have a greater degree of freedom in configuring [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" title="Ubuntu" src="http://t2.gstatic.com/images?q=tbn:ujcGZQra1Ftj0M:http://www.go2linux.org/pics/single_pictures/Ubuntu-logo.jpg&amp;t=1" alt="" width="102" height="102" />I&#8217;m amazed that <a href="http://www.ubuntu.com/">Ubuntu</a> doesn&#8217;t have a good default Desktop Search tool. I&#8217;ve been using Ubuntu for a while now and the lack of a perfect desktop search is felt very strongly.</p>
<p><img class="alignright" title="Linux" src="http://www.unixmen.com/images/stories/linuxlogos/linux-logo.png" alt="" width="128" height="128" />The need for such a tool is greater in Linux than in Windows, considering you have a greater degree of freedom in configuring obscure files to make the OS perform the way you want it to. From that perspective, it has become quite a pain searching for the files of something I installed a long time ago. Also, I&#8217;ve been reading a few research papers of late and I&#8217;ve been making some notes in plain-text files. Re-finding something is a nightmare if you can&#8217;t remember the filename.</p>
<p>So I started looking into the different desktop search options I had.</p>
<p><strong>Beagle</strong></p>
<p><img class="alignright" title="Beagle Project" src="http://www.noulakaz.net/weblog/images/20060409-beagle-logo.png" alt="" width="262" height="134" /><a href="http://beagle-project.org/Main_Page">Beagle</a> is widely considered to be the leading tool for desktop search on Ubuntu. It&#8217;s written in C# and runs on Mono. It uses <a href="http://lucene.apache.org/lucene.net/">Lucene&#8217;s C# implementation</a> for indexing and it seems to be good. It was launched with (a bit of) fanfare back at the <a href="http://2004.guadec.org/">2004 GUADEC conference</a> in Norway. It is quite similar to <a href="http://www.apple.com/downloads/macosx/networking_security/searchlight.html">Apple Spotlight</a> and was, interestingly, announced on the same day as Spotlight. Norway being 6+ hours ahead of the United States, it was technically announced earlier!</p>
<p>But Beagle faced performance issues with its &#8216;crawling&#8217; process, which was very processor heavy. Users grew so irritated with the constant processing involved in keeping the index fresh that many disabled the tool. Some time ago <a href="http://mail.gnome.org/archives/dashboard-hackers/2010-January/msg00001.html"><strong>development on the Beagle Project was stopped</strong></a> and so, I believe, was adoption by new users.</p>
<p>As a product, though, Beagle is quite good. It has a rich feature set and, if you ignore the index refresh issues, it&#8217;s quite good at retrieval. It indexes all imaginable file types and has some interesting features, like support for <a href="http://en.wikipedia.org/wiki/Glob_%28programming%29"><em>glob expressions</em></a>, which I haven&#8217;t found in other search engines (except <a href="http://www.lesbonscomptes.com/recoll/">Recoll</a>).</p>
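<p>For anyone who hasn&#8217;t met glob expressions: they are the shell-style wildcard patterns (<code>*</code>, <code>?</code>, <code>[abc]</code>) you already use on the command line. Here is a small Python sketch of the matching idea using the standard <code>fnmatch</code> module. The file names and the <code>glob_search</code> helper are invented for the example, and this is not how Beagle implements it.</p>

```python
from fnmatch import fnmatchcase


def glob_search(names, pattern):
    """Return the names that match a shell-style glob pattern, ignoring case."""
    wanted = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), wanted)]


files = ["notes-ir.txt", "paper-2010.pdf", "Notes-Beagle.TXT", "xorg.conf"]

print(glob_search(files, "notes-*.txt"))  # ['notes-ir.txt', 'Notes-Beagle.TXT']
print(glob_search(files, "*.p?f"))        # ['paper-2010.pdf']
```

<p>Lower-casing both sides is a design choice for this sketch; a stricter search would compare the names as they are.</p>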
<p><strong>Tracker</strong></p>
<p><img class="alignright" title="metaTracker" src="http://projects.gnome.org/tracker/images/meta_tracker_logo.png" alt="" width="491" height="93" />Tracker is short for <a href="http://projects.gnome.org/tracker">Meta Tracker</a>. It is a light and fast desktop search tool implemented in C. Tracker&#8217;s lightness has meant that it doesn&#8217;t index the innards of some types of files &#8211; pdf, ppt, mailboxes etc. But it is faster and doesn&#8217;t eat up processor cycles, so it&#8217;s more of a quick and dirty solution. And of course if you can&#8217;t find something using it, you can always switch over to good ol&#8217; <code>grep</code>.</p>
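<p>When the index fails you, the fallback is a brute-force scan of file contents, which is roughly what a recursive <code>grep</code> does. A rough Python sketch of that idea follows. The <code>find_in_files</code> name and the sample notes are hypothetical, and the real <code>grep</code> is far faster and supports regular expressions.</p>

```python
import os
import tempfile


def find_in_files(root, needle):
    """Yield (path, line_number, line) for every line under `root` containing `needle`."""
    for folder, _subdirs, filenames in os.walk(root):
        for filename in sorted(filenames):
            path = os.path.join(folder, filename)
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it and keep searching


# Build a throwaway notes directory so the example is self-contained.
with tempfile.TemporaryDirectory() as notes:
    with open(os.path.join(notes, "ir.txt"), "w", encoding="utf-8") as f:
        f.write("tf-idf weighting\ninverted index basics\n")
    with open(os.path.join(notes, "todo.txt"), "w", encoding="utf-8") as f:
        f.write("buy milk\n")

    for path, number, line in find_in_files(notes, "index"):
        print(os.path.basename(path), number, line)  # ir.txt 2 inverted index basics
```

<p>Every search re-reads every file, which is exactly the cost an indexer like Tracker or Beagle pays up front instead.</p>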
<p>Earlier, documentation was very poor on Tracker, but it has improved, I feel.</p>
<p><strong><img class="alignright" title="Google Desktop Search" src="http://t1.gstatic.com/images?q=tbn:wCKJ8gx9wh2GsM:http://blog.therealdavidfield.com/storage/Google-propune-companiilor-Google-Desktop-Search-Enterprise-2.jpg?__SQUARESPACE_CACHEVERSION=1262212154887&amp;t=1" alt="" width="176" height="176" />Google Desktop Search<br />
</strong></p>
<p><a href="http://desktop.google.com/linux/">Google Desktop Search</a> isn&#8217;t open source, so it doesn&#8217;t get brownie points. But it is quite useful.  It indexes almost all file-types including pdf and ppt. The engineering behind it is good and it doesn&#8217;t hog (too much) processor time. If you can get over the fact that Google will get even more of your information, then this is the desktop search tool for you. I prefer to have at least a bit of my own life to myself, so I&#8217;m left with very few options.</p>
<p>There are many other tools which I haven&#8217;t mentioned here.</p>
<p><a href="http://terrier.org/">Terrier</a> is one which I haven&#8217;t tested out in detail yet. I will be looking through the code of this, though, <a href="http://technicallypossible.wordpress.com/2010/05/23/future-plans-information-retrieval/">for obvious reasons.</a> It isn&#8217;t built with desktop search in mind, but might perform well.</p>
<p><a href="http://www.lesbonscomptes.com/recoll/">Recoll</a> is one which I haven&#8217;t tried out fully yet. It runs on a <a href="http://xapian.org/">Xapian</a> back-end, about which I know very little.</p>
]]></content:encoded>
			<wfw:commentRss>http://technicallypossible.wordpress.com/2010/08/25/desktop-search-on-ubuntu/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<georss:point>10.107640 76.351578</georss:point>
		<geo:lat>10.107640</geo:lat>
		<geo:long>76.351578</geo:long>
		<media:content url="http://0.gravatar.com/avatar/0012a796c32c781113a71f1cbff11480?s=96&#38;d=monsterid&#38;r=G" medium="image">
			<media:title type="html">Neel</media:title>
		</media:content>

		<media:content url="http://t2.gstatic.com/images?q=tbn:ujcGZQra1Ftj0M:http://www.go2linux.org/pics/single_pictures/Ubuntu-logo.jpg&#038;t=1" medium="image">
			<media:title type="html">Ubuntu</media:title>
		</media:content>

		<media:content url="http://www.unixmen.com/images/stories/linuxlogos/linux-logo.png" medium="image">
			<media:title type="html">Linux</media:title>
		</media:content>

		<media:content url="http://www.noulakaz.net/weblog/images/20060409-beagle-logo.png" medium="image">
			<media:title type="html">Beagle Project</media:title>
		</media:content>

		<media:content url="http://projects.gnome.org/tracker/images/meta_tracker_logo.png" medium="image">
			<media:title type="html">metaTracker</media:title>
		</media:content>

		<media:content url="http://t1.gstatic.com/images?q=tbn:wCKJ8gx9wh2GsM:http://blog.therealdavidfield.com/storage/Google-propune-companiilor-Google-Desktop-Search-Enterprise-2.jpg?__SQUARESPACE_CACHEVERSION=1262212154887&#038;t=1" medium="image">
			<media:title type="html">Google Desktop Search</media:title>
		</media:content>
	</item>
	</channel>
</rss>
