Human Being | BI & Social Media enthusiast | Voracious Reader | Amateur PhotographerThe Beautiful MindTumblr (3.0; @nellaikanth)http://nellaikanth.tumblr.com/What can I learn right now in just 10 minutes that could be useful for the rest of my life?<p>Answer by Lee Jenkinson:</p><blockquote>Be honest, tell the truth, or say nothing at all. Constant lying requires an excellent memory and is ultimately self defeating. Cheating only diminishes you as a person.<br/>Read “Desiderata” by Max Ehrmann.<br/>Make eye contact.<br/>Live the Golden Rule.<br/>Don’t be afraid to take chances and make mistakes.<br/>Let go of anger and hatred, it is corrosive.<br/>Forgive, but never forget.</blockquote><span class="qlink_container"><a href="http://www.quora.com/Tips-and-Hacks-for-Everyday-Life/What-can-I-learn-right-now-in-just-10-minutes-that-could-be-useful-for-the-rest-of-my-life/answer/Lee-Jenkinson">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/90290188301http://nellaikanth.tumblr.com/post/90290188301Sun, 29 Jun 2014 18:10:46 -0400What are the most popular computer programming jokes?<p>Answer by Saurabh Gaur:</p><blockquote><div><img class="portrait qtext_image zoomable_in_feed" src="http://qph.is.quoracdn.net/main-qimg-7380d3c4f8c98ed6458e9a54d88415e0?convert_to_webp=true" master_src="http://qph.is.quoracdn.net/main-qimg-7380d3c4f8c98ed6458e9a54d88415e0?convert_to_webp=true" master_w="480" master_h="960"/></div></blockquote><span class="qlink_container"><a href="http://www.quora.com/Jokes/What-are-the-most-popular-computer-programming-jokes/answer/Saurabh-Gaur-13">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/90249416001http://nellaikanth.tumblr.com/post/90249416001Sun, 29 Jun 2014 09:33:16 -0400What are the disadvantages of using Node.js?<p>Answer by Alexander Gugel:</p><blockquote><ul><li><span class="qlink_container"><a href="http://callbackhell.com/" rel="nofollow" class="external_link" target="_blank" onmouseover='return 
require("qtext").tooltip(this, "callbackhell.com")'>Callback Hell</a></span>: Everything is asynchronous by default. This means you are likely to end up using tons of nested callbacks. Nevertheless, there are a number of solutions to this problem, e.g. <span class="qlink_container"><a href="https://github.com/caolan/async" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "github.com")'>caolan/async</a></span> or <span class="qlink_container"><a href="http://stackoverflow.com/questions/4296505/understanding-promises-in-node-js" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "stackoverflow.com")'>Understanding promises in node.js</a></span>. This problem comes from JavaScript itself, not from Node.js.</li></ul><ul><li>Single-threaded: Node.js is single-threaded. You can still take advantage of multiple CPUs by running several processes, but in general everything is designed around the Event Loop in order to achieve high performance. This can also be an advantage, since, e.g., write conflicts on files are less of a concern.</li></ul><ul><li>JavaScript: There is a reason why it is called Node.<b>JS</b>. JavaScript was designed in 10 days, and in places that shows. I really like JS, but there are some obvious drawbacks (then again, every programming language sucks; just read <span class="qlink_container"><a href="https://wiki.theory.org/YourLanguageSucks#JavaScript_sucks_because" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "theory.org")'>YourLanguageSucks - Theory.org Wiki</a></span>). Prototypal instantiation wasn’t originally planned to be integrated. Prototypes are much more elegant, but since Netscape wanted to jump onto the Java-train, they wanted to introduce the <i>new</i> keyword.</li><li>Event Loop: The Event Loop is the core of Node.js and it’s a genius idea. But: Don’t use Node.js for blocking, CPU-intensive tasks. Node.js is not suited for stuff like that. 
Node.js is suited for I/O stuff (like web servers).</li></ul><br/><b>TL;DR</b><br/>Don’t use Node.js for CPU-intensive tasks.<br/>Node.js rocks for servers.<br/>JavaScript partly sucks. Use <span class="qlink_container"><a href="http://coffeescript.org/" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "coffeescript.org")'>CoffeeScript</a></span> in order to solve this problem. <br/><br/>Node.js certainly has some disadvantages, but it is currently one of the best tools out there in order to create asynchronous, non-blocking apps. It’s great.</blockquote><span class="qlink_container"><a href="http://www.quora.com/What-are-the-disadvantages-of-using-Node-js/answer/Alexander-Gugel">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/90246998371http://nellaikanth.tumblr.com/post/90246998371Sun, 29 Jun 2014 08:47:47 -0400Yahoo Betting on Apache Hive, Tez, and YARN<a href="http://yahoodevelopers.tumblr.com/post/85930551108/yahoo-betting-on-apache-hive-tez-and-yarn">Yahoo Betting on Apache Hive, Tez, and YARN</a>: <p><a href="http://yahoodevelopers.tumblr.com/post/85930551108/yahoo-betting-on-apache-hive-tez-and-yarn" class="tumblr_blog">yahoodevelopers</a>:</p>
<blockquote>
<p><em>by The Hadoop Platforms Team </em></p>
<p>Low-latency SQL queries, Business Intelligence (BI), and Data Discovery on Big Data are some of the hottest topics these days in the industry with a range of solutions coming to life lately to address them as either proprietary or open-source implementations on…</p></blockquote>http://nellaikanth.tumblr.com/post/86106474706http://nellaikanth.tumblr.com/post/86106474706Sun, 18 May 2014 09:58:25 -0400What are the best Python scripts you've ever written?<p>Answer by Abhijit Agarwal:</p><blockquote><h2>A script to get rank and fees information about CS master’s courses in Europe </h2><h2><br/></h2>I was looking into the idea of pursuing my master’s in Computer Science in Europe. I found a <span class="qlink_container"><a href="http://mastersportal.eu" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "mastersportal.eu")'>website</a></span> where you could view master’s courses in different subjects and countries all over Europe.<br/><br/>But the fees are different for students within the EU and students outside of the EU, and the latter were not always listed. Moreover, there was no way to see how well regarded a college is, because there was no ranking on the page.<br/><br/>So I wrote a script to scrape the results on the website, find each course’s fees for non-EU students and the <span class="qlink_container"><a href="http://www.topuniversities.com/university-rankings" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "topuniversities.com")'>QS</a></span> and <span class="qlink_container"><a href="http://www.shanghairanking.com/" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "shanghairanking.com")'>ARWU</a></span> rankings of the university, and write all this data into a CSV file.<br/><br/>This script also made use of the Wolfram Alpha API and the Google search engine. 
I used the BeautifulSoup library for scraping and the lxml library for parsing the XML output from <span class="qlink_container"><a href="http://wolframalpha.com" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "wolframalpha.com")'>Wolfram Alpha</a></span>.<br/><br/>Here is a sample output for the keyword “Graphic Design”:<br/><div><img class="landscape qtext_image zoomable_in zoomable_in_feed" src="http://qph.is.quoracdn.net/main-qimg-c63898863b595fde4cb5993c1a2a1743?convert_to_webp=true" master_src="http://qph.is.quoracdn.net/main-qimg-26bd60ccf1a15fd945e3907c57f16ae8?convert_to_webp=true" master_w="1364" master_h="493"/></div><br/>It is definitely not my best script, but it solved a problem that interested me a lot at the time.<br/>You can look at the code at <span class="qlink_container"><a href="https://github.com/abhijit148/masterSearch/tree/develop" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "github.com")'>abhijit148/masterSearch</a></span><br/>Please note that even though the code worked the last time I checked, it is nowhere near what we call “Good Quality” code and I request that you not judge me by how messy it is. :)<br/>You are welcome to contribute to the development of this code and if you do it independently, send me a link when you are done maybe?</blockquote><span class="qlink_container"><a href="http://www.quora.com/Python-programming-language-1/What-are-the-best-Python-scripts-youve-ever-written/answer/Abhijit-Agarwal-1">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/83316580880http://nellaikanth.tumblr.com/post/83316580880Sun, 20 Apr 2014 12:59:19 -0400What are the best day-to-day time-saving hacks?<p>Answer by Charudutt Wasnikar:</p><blockquote>1. Prioritize<br/>2. Delete unwanted mails / folders on desk<br/>3. Have a Beginning of Day (BOD) list<br/>4. Have an End of Day (EOD) list: unless you have BOD and EOD lists, your agenda won’t be set<br/>5. 
Set aside 20% of your time for ad hoc tasks. If there are no ad hoc tasks, you can spend that time talking with colleagues, teams, and bosses<br/>6. Mark mails as important or not important. Set a deadline for completing each task and replying to the mails<br/>7. Use a diary / notebook</blockquote><span class="qlink_container"><a href="http://www.quora.com/Productivity/What-are-the-best-day-to-day-time-saving-hacks/answer/Charudutt-Wasnikar">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/83316315198http://nellaikanth.tumblr.com/post/83316315198Sun, 20 Apr 2014 12:56:14 -0400Sum it Up<p><a class="tumblr_blog" href="http://www.techinterview.org/post/526329049/sum-it-up">techinterview</a>:</p>
<blockquote>
<p>Problem: You are given a sequence of the numbers from 1 to n-1 in which exactly one number appears twice (example: 1 2 3 3 4 5). How can you find the repeated number? What if I add the constraint that you can’t use a dynamic amount of memory (i.e., the amount of memory you use can’t depend on n)? <br/>What if there are two repeated numbers (with the same memory constraint)?</p>
<p><a href="http://www.techinterview.org/post/526329049/sum-it-up">Read More</a></p>
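<p>One possible approach, hinted at by the title, is sketched below in Python. It is not taken from the linked post: the function names <code>find_one_repeat</code> and <code>find_two_repeats</code> are made up for illustration, and the sketch assumes the input really has the stated shape (every value from 1 up to the maximum appears once, plus one or two extra copies of distinct values). Only a few running totals are kept, so the number of stored values does not grow with n.</p>

```python
from math import isqrt


def find_one_repeat(seq):
    """Return the value that appears twice, using only running totals."""
    total = 0
    length = 0
    for x in seq:
        total += x
        length += 1
    largest = length - 1  # values are 1..largest, plus one extra copy
    # Subtract the sum 1 + 2 + ... + largest; what remains is the extra copy.
    return total - largest * (largest + 1) // 2


def find_two_repeats(seq):
    """Return the two distinct repeated values as a sorted pair (a, b)."""
    total = 0
    squares = 0
    length = 0
    for x in seq:
        total += x
        squares += x * x
        length += 1
    largest = length - 2  # values are 1..largest, plus two extra copies
    s = total - largest * (largest + 1) // 2                        # a + b
    q = squares - largest * (largest + 1) * (2 * largest + 1) // 6  # a^2 + b^2
    # (b - a)^2 = 2(a^2 + b^2) - (a + b)^2, so a and b follow from s and q.
    gap = isqrt(2 * q - s * s)
    return (s - gap) // 2, (s + gap) // 2
```

<p>For the example above, <code>find_one_repeat([1, 2, 3, 3, 4, 5])</code> returns 3, and <code>find_two_repeats([1, 2, 2, 3, 4, 4, 5])</code> returns (2, 4).</p>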
</blockquote>http://nellaikanth.tumblr.com/post/74778426890http://nellaikanth.tumblr.com/post/74778426890Mon, 27 Jan 2014 19:48:07 -0500Road to Data Science<p>Post by Adriano Stephan:</p><blockquote>Sort of like the Paris metro map for Machine Learning :-) see the full-blown version here: <a href="https://qph.is.quoracdn.n">https://qph.is.quoracdn.n</a><wbr></wbr>et/main-qimg-3467d861335e<wbr></wbr>b3a58067278fd83ca2ce</blockquote><a href="http://compute.quora.com/Where-can-I-find-a-good-road-map-for-Data-Science-and-Machine-Learning?srid=huKI&share=1">View Post on Quora</a>http://nellaikanth.tumblr.com/post/69221121146http://nellaikanth.tumblr.com/post/69221121146Fri, 06 Dec 2013 20:19:51 -0500Scalable machine learning :: video lectures by Alex Smola<p>Post by Adriano Stephan:</p><blockquote>Scalable machine learning :: video lectures by Alex Smola</blockquote><a href="http://compute.quora.com/Scalable-machine-learning-video-lectures-by-Alex-Smola?srid=huKI&share=1">View Post on Quora</a>http://nellaikanth.tumblr.com/post/69221092523http://nellaikanth.tumblr.com/post/69221092523Fri, 06 Dec 2013 20:19:30 -0500How can I use Nate Silver's methods to accurately predict future events?<p>Answer by Jay Wacker:</p><blockquote>1.) gather an immense amount of data<br/>2.) perform relatively simple statistical analyses <br/>3.) add expert knowledge to fix naive use of statistics<br/><br/>The only thing that Nate Silver (and a dozen or so different groups) showed is that the polls were accurate and treating them in a straightforward honest manner gives a more accurate answer than any single poll. <br/><br/>To be fair, Nate Silver corrected for numerous subtleties of the polls having to do with systematic bias, whether possibly intentional or not. They deweighted polls that were historically off or systematically disagreed with other polls. 
However, all of the expert-knowledge corrections they introduced were independently developed by several other groups, showing that standard practices gave the right result.<br/><br/>The key thing about taking this method forward is that without the polls (he had <i>hundreds</i> of polls, each covering thousands of people, properly sampled and corrected for demographics), Nate Silver had exactly bupkis, which he would say as well. The polls were his data. Getting the data is expensive and hard. Analyzing it correctly, if not easy, is something that hundreds of thousands of scientists are trained to do. <br/><br/>So if you have extensive polling, I’d advise combining the polls in a weighted average, with each poll weighted inversely proportional to its error. If you have historical information, identify ways that the polls systematically skew and correct for that. However, these events are few and far between.</blockquote><span class="qlink_container"><a href="http://www.quora.com/Nate-Silver/How-can-I-use-Nate-Silvers-methods-to-accurately-predict-future-events/answer/Jay-Wacker">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/69221064032http://nellaikanth.tumblr.com/post/69221064032Fri, 06 Dec 2013 20:19:10 -0500What is a simplified explanation and proof of the Johnson-Lindenstrauss lemma?<p>Answer by Alex Clemmer:</p><blockquote><b>Why are random projections probably nearly as effective as carefully designed ones?</b><br/><br/>This question you have asked in the “details” section is the key question of the JL lemma. And the unfortunate truth is that they <i>aren’t</i>. Or rather, they are, but only when your data are all over the place.<br/><br/>To see why, consider this toy example I drew. (Note that this example was picked because we can visualize 3 dimensions, and not because the math of the JL lemma is useful here! 
It is for intuition <i>only</i>.)<br/><div><img class="landscape qtext_image zoomable_in zoomable_in_feed" src="http://qph.cf.quoracdn.net/main-qimg-16650ea96135fe51c614c34f6a8ec419" master_src="http://qph.cf.quoracdn.net/main-qimg-2ef9a1190b95b431316c3782c4fafc43" master_w="5143" master_h="2185"/></div><br/>This brings us to our first point: in order to understand a “simplified” version of the JL lemma, we must understand:<br/><br/><h2>What is the JL lemma really saying about projections?<br/></h2>Intuitively, what the JL lemma says is this: <b>if you pick a random subspace and project onto it, the scaled pairwise distances between points will </b><b><i>probably</i></b><b> be preserved</b>.<br/><br/>This will be true regardless of the pointset you have. But you’ll notice that in the example on the right, some planes seem to be “better” than others. In the pointset on the left, you could probably project onto any plane, and it would almost certainly be equally bad. But the data on the right seem to lie close to a plane, so intuitively, the planes “close” to the data seem to be “less bad.”<br/><br/>So on the one hand, the JL lemma is telling us that pairwise distances are probably not distorted. But on the other hand, geometry is telling us that some projections are “better” than others. And this mismatch tells us something interesting about random projections:<br/><ul><li><b>Pairwise distance does not tell us everything there is to know about dimensionality reduction</b>. The JL lemma by itself is not able to tell us why some projections are worse than others in the dataset on the right. All it tells us is that the scaled pairwise distance is not distorted too much.</li><li><b>But it’s still pretty useful</b>. 
For example, if you were running approximate nearest neighbors or something, then you could pick a random projection and reduce the dimension significantly, but still be mostly correct.<br/></li></ul><br/>So in some sense, the JL lemma seems to work because pairwise distances are not quite as important for dimensionality reduction as we might have hoped. Still, it is interesting that they would be this resilient to random projection, and it seems worth wondering why.<br/><br/><h2>A more formal definition of the JL lemma<br/></h2><br/>First I’ll state the JL lemma, and then demonstrate the intuition using a caricature of a proof, which should give you a good idea why it is true.<br/><br/><b><i>Proposition 1 (the JL lemma):</i></b><i> For some <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-e9921ebaaf46a9f5.png" width="138" height="23" class="math" type="math" title="k \ge O(\log m / \varepsilon^2)" alt="k \ge O(\log m / \varepsilon^2)"/></span> (where <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-d6a496cfd1f584ae.png" width="9" height="9" class="math" type="math" title="\varepsilon" alt="\varepsilon"/></span> is our chosen error tolerance), with high probability, map <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-d1714f620fea8dd4.png" width="103" height="22" class="math" type="math" title="f : \mathbb{R}^d \rightarrow \mathbb{R}^k" alt="f : \mathbb{R}^d \rightarrow \mathbb{R}^k"/></span> does not change the pairwise distance between any two points more than a factor of <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-9b0a535e56040c83.png" width="56" height="21" class="math" type="math" title="(1 \pm \varepsilon)" alt="(1 \pm \varepsilon)"/></span>, after scaling by <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-d5186f31ee65a679.png" width="49" height="31" class="math" type="math" title="\sqrt{n/k}" alt="\sqrt{n/k}"/></span>).</i><br/><br/>There are a couple of things that are good to 
notice about the JL lemma that might help you to know when it’s applicable to your problems.<br/><ul><li>It makes statements for reduction from high space to “medium” space. It does <i>not</i> really work in the same way for extremely small space, like <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-d6ebffa577cc509a.png" width="20" height="18" class="math" type="math" title="\mathbb{R}^1" alt="\mathbb{R}^1"/></span>.</li><li>According to the JL lemma, our choice of <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-3f5c31a281e6d581.png" width="9" height="14" class="math" type="math" title="k" alt="k"/></span> (recall we’re mapping from <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-18e8f5c824a1f38e.png" width="22" height="18" class="math" type="math" title="\mathbb{R}^d" alt="\mathbb{R}^d"/></span> to <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-2e8e6a314815f590.png" width="22" height="18" class="math" type="math" title="\mathbb{R}^k" alt="\mathbb{R}^k"/></span>) should depend <i>only</i> on the number of points <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-23b52887cf3721b1.png" width="17" height="9" class="math" type="math" title="m" alt="m"/></span> and our error tolerance <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-d6a496cfd1f584ae.png" width="9" height="9" class="math" type="math" title="\varepsilon" alt="\varepsilon"/></span>.<br/></li></ul><br/><h2>A Caricature of a Proof</h2>[<i>Note: </i>I believe this “caricature” is due to Avrim Blum, but I can’t find it, and so can’t be sure.]<br/><br/>The gist of the proof is this: <b>the JL lemma follows from the fact that the squared length of a vector is sharply concentrated around its mean when projected onto a random </b><span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-3f5c31a281e6d581.png" width="9" height="14" class="math" type="math" title="k" alt="k"/></span><b>-dimensional subspace</b>. 
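<br/><br/>A quick way to see this concentration numerically is the hedged sketch below. It is not part of the original argument, and the names <code>dim</code>, <code>k</code>, <code>trials</code> and <code>seed</code> are arbitrary choices for illustration. It draws random unit vectors in a <code>dim</code>-dimensional space and measures the squared length of their first <code>k</code> coordinates, which by rotational symmetry stands in for projecting a fixed unit vector onto a random <code>k</code>-dimensional subspace.

```python
import random


def projected_sq_length(dim, k, rng):
    """Squared length of the first k coordinates of a random unit vector."""
    # A normalized Gaussian vector is uniformly distributed on the unit sphere.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm_sq = sum(x * x for x in v)
    # Dividing by the squared norm is the same as normalizing v first.
    return sum(x * x for x in v[:k]) / norm_sq


def concentration_demo(dim=1000, k=200, trials=200, seed=0):
    """Return (mean, worst relative deviation from k/dim) over all trials."""
    rng = random.Random(seed)
    expected = k / dim
    samples = [projected_sq_length(dim, k, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    worst = max(abs(s - expected) for s in samples) / expected
    return mean, worst


if __name__ == "__main__":
    mean, worst = concentration_demo()
    print(f"mean squared length: {mean:.4f} (k/dim = 0.2)")
    print(f"worst relative deviation: {worst:.3f}")
```

With these made-up settings the mean should land close to k/dim = 0.2, and even the worst of the 200 samples should stay within a modest relative distance of it.<br/><br/>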
What we aim to show in this section is <b>why</b> this would even be true or useful for proving the JL lemma.<br/><br/>We will begin with something that doesn’t really resemble the JL lemma, but by the end it will hopefully become clear what the relationship is.<br/><br/>First, let’s say we randomly sample <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-23b52887cf3721b1.png" width="17" height="9" class="math" type="math" title="m" alt="m"/></span> points that lie on the surface of a <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-8cc4d1a08841dd25.png" width="11" height="14" class="math" type="math" title="d" alt="d"/></span>-dimensional sphere. These points can also be seen as random unit-length vectors.<br/><div><img class="landscape qtext_image zoomable_in zoomable_in_feed" src="http://qph.cf.quoracdn.net/main-qimg-7c80846236fe3381ea00a5266914e97f" master_src="http://qph.cf.quoracdn.net/main-qimg-e0b6180cc9d68d2f6be27072f2b7a486" master_w="5143" master_h="2185"/></div><br/>We would like to look at how individual coordinates behave. So we will simplify the picture somewhat.<br/><br/>Since they lie on a <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-8cc4d1a08841dd25.png" width="11" height="14" class="math" type="math" title="d" alt="d"/></span>-dimensional sphere, they are points in <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-18e8f5c824a1f38e.png" width="22" height="18" class="math" type="math" title="\mathbb{R}^d" alt="\mathbb{R}^d"/></span>. 
But, notice that, regardless of the size of <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-8cc4d1a08841dd25.png" width="11" height="14" class="math" type="math" title="d" alt="d"/></span>, we can say that these points lie in an <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-23b52887cf3721b1.png" width="17" height="9" class="math" type="math" title="m" alt="m"/></span>-dimensional subspace, because there are only <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-23b52887cf3721b1.png" width="17" height="9" class="math" type="math" title="m" alt="m"/></span> points. So we’ll say that they lie in <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-623a95c72fd52f1a.png" width="27" height="15" class="math" type="math" title="\mathbb{R}^m" alt="\mathbb{R}^m"/></span>.<br/><br/>Let’s start by looking at the first coordinate. By standard probability theory, we know that <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-dfc2235904e7d140.png" width="106" height="24" class="math" type="math" title="\mathbb{E}[x^2_1] = 1/m" alt="\mathbb{E}[x^2_1] = 1/m"/></span>. Note that as <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-23b52887cf3721b1.png" width="17" height="9" class="math" type="math" title="m" alt="m"/></span> gets bigger, the concentration rises precipitously, meaning that the actual values of the coordinates will be sharply concentrated around this value.<br/><br/>Now we’d like to extend this intuition to look at all of the coordinates. 
Since the value of these coordinates will be sharply concentrated around <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-9ec7c2536ac6cccc.png" width="36" height="21" class="math" type="math" title="1/m" alt="1/m"/></span> for large <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-23b52887cf3721b1.png" width="17" height="9" class="math" type="math" title="m" alt="m"/></span>, we can see that the coordinates kind of sort of look iid. They’re not really, because if one coordinate is large, the others are necessarily small, but this “sharpness” means that they are “almost” iid. This is not a real proof, right? So let’s say they’re iid for illustrative purposes.<br/><br/>If they’re iid, then we can apply our favorite Chernoff/Hoeffding bound to say that, with really high probability, all the coordinates will be really really close to being <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-9ec7c2536ac6cccc.png" width="36" height="21" class="math" type="math" title="1/m" alt="1/m"/></span> in size. For example, <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-1eca919326c04c63.png" width="397" height="27" class="math" type="math" title="p(|(x_1^2 + \ldots + x_k^2) - k/n| \ge \varepsilon k/n] \le 1/(e^{O(k \varepsilon^2)} )" alt="p(|(x_1^2 + \ldots + x_k^2) - k/n| \ge \varepsilon k/n] \le 1/(e^{O(k \varepsilon^2)} )"/></span>. Remember, this is if they’re iid, which of course they aren’t, but they <i>kind of</i> look like they are. This is intuition.<br/><br/>At this point we’re ready to look at random projections. The basic idea is, we’re going to take a random plane and project our unit vectors onto it. 
Here is a series of “random” examples that I made up (note that the plane of projection is translated to the origin).<br/><div><img class="landscape qtext_image zoomable_in zoomable_in_feed" src="http://qph.cf.quoracdn.net/main-qimg-122532a1a32f220a66672ed4786974d7" master_src="http://qph.cf.quoracdn.net/main-qimg-be9299969e845d07fc731af1c1477476" master_w="5143" master_h="2185"/></div>But it turns out that we can look at vector projections in a more interesting way. Projecting from <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-623a95c72fd52f1a.png" width="27" height="15" class="math" type="math" title="\mathbb{R}^m" alt="\mathbb{R}^m"/></span> to <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-2e8e6a314815f590.png" width="22" height="18" class="math" type="math" title="\mathbb{R}^k" alt="\mathbb{R}^k"/></span> is basically the same thing as randomly rotating the vector, and then reading off the first <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-3f5c31a281e6d581.png" width="9" height="14" class="math" type="math" title="k" alt="k"/></span> coordinates.<br/><br/>Now we get to the JL lemma: <b>in order for the JL lemma to be basically true, the distance between the two vectors must be almost the same after we scale it accordingly!</b> In the picture above, the projected vectors each have a dotted line going between them, depicting the distance. So, in other words, that vector needs to be almost the same scaled length as it would be in the original example.<br/><br/>Unsurprisingly, it turns out that this is true. 
Here’s the math that justifies it.<br/><br/>Using the above Chernoff-Hoeffding bound, we see that at <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-eb39f5a9292af705.png" width="130" height="41" class="math" type="math" title="k = O(\frac{1}{\varepsilon^2} \log n)" alt="k = O(\frac{1}{\varepsilon^2} \log n)"/></span>, then with probability <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-1ae3816a4a8c6074.png" width="82" height="21" class="math" type="math" title="1 - O(n^p)" alt="1 - O(n^p)"/></span> (for almost whatever positive integer choice of <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-c410f6fcde1d3b60.png" width="11" height="13" class="math" type="math" title="p" alt="p"/></span> you like), the projection to the first <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-3f5c31a281e6d581.png" width="9" height="14" class="math" type="math" title="k" alt="k"/></span> coordinates has length <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-ae909dc162812dd3.png" width="119" height="31" class="math" type="math" title="\sqrt{k/n} \cdot (1 \pm \varepsilon)" alt="\sqrt{k/n} \cdot (1 \pm \varepsilon)"/></span>.<br/><br/>Now, let’s look at the distance between some pair of vectors. Let’s say our regular non-projected vectors are called <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-bb200822e1cd2643.png" width="15" height="19" class="math" type="math" title="\vec{v}_1" alt="\vec{v}_1"/></span> and <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-3dc4ab54ae24530d.png" width="16" height="18" class="math" type="math" title="\vec{v}_2" alt="\vec{v}_2"/></span>. 
Then the vector that represents the dotted line between the original vectors would be <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-858c2a1dd636dd94.png" width="93" height="25" class="math" type="math" title="\vec{d} = \vec{v}_2 - \vec{v}_1" alt="\vec{d} = \vec{v}_2 - \vec{v}_1"/></span>.<br/><br/>And here’s the punch line. We can take that “distance vector” that goes between our original vectors, and use the same projection argument as above. Thus, <b>with high probability (by the union bound), the length of all these “distance vectors” </b><span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-bb9109f980140981.png" width="15" height="21" class="math" type="math" title="\vec{d}" alt="\vec{d}"/></span> <b>project to length</b> <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-5e5ecd800e8b51a3.png" width="144" height="51" class="math" type="math" title="\sqrt{\frac{k}{n}} \cdot (1 \pm \varepsilon) ||\vec{d}||_2" alt="\sqrt{\frac{k}{n}} \cdot (1 \pm \varepsilon) ||\vec{d}||_2"/></span>.<br/><br/><h2>Toward more general arguments.<br/></h2>So that’s the intuition for why the JL lemma is true. As we said before, the JL lemma follows from the fact that the squared length of a vector is sharply concentrated around its mean when projected onto a random <span class="math_w"><img src="http://qlx.cf.quoracdn.net/main-3f5c31a281e6d581.png" width="9" height="14" class="math" type="math" title="k" alt="k"/></span>-dimensional subspace. This last section, I hope, explains roughly both why this would help us prove the JL lemma, and is a beginning on why it’s true on a more general basis.<br/><br/>If you want to know more, I hope this arms you nicely to figuring out the proofs floating around out there. I would say that <span class="qlink_container"><a href="http://www.quora.com/Alexandre-Passos">Alexandre Passos</a></span>'s answer is a reasonable start. An excellent treatment exists in Foundations of Machine Learning, by Mohri <i>et al</i>. 
It’s an excellent book anyway, and you should read it just because.</blockquote><span class="qlink_container"><a href="http://www.quora.com/What-is-a-simplified-explanation-and-proof-of-the-Johnson-Lindenstrauss-lemma/answer/Alex-Clemmer">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/69221033651http://nellaikanth.tumblr.com/post/69221033651Fri, 06 Dec 2013 20:18:50 -0500What are some good Undergraduate projects in Quantum Computing and algorithms?<p>Answer by Igor Markov:</p><blockquote>Figure out how Shor’s algorithm works and implement a simulator for it in Java or C++. Then try to reduce the size of the circuit by various modifications and see how this affects the result of the algorithm (there are some modifications that would only slightly degrade the performance).<br/><br/>Figure out how the stabilizer formalism (Heisenberg representation) works and implement a simulator for stabilizer circuits, following the work of Daniel Gottesman or his 2004 paper with Scott Aaronson. Find recent papers by Richard Jozsa on arxiv that propose a different (simpler) simulation technique, implement it, and compare empirically to the original algorithm in terms of runtime and memory usage.<br/><br/>Given a quantum circuit (either a circuit design or a physical device - different cases), how do you check that it does what it should do? If it misses one gate, can you find which gate is missing ? 
There isn’t much literature on this, but you can find a few reasonable papers.</blockquote><span class="qlink_container"><a href="http://www.quora.com/Quantum-Computation/What-are-some-good-Undergraduate-projects-in-Quantum-Computing-and-algorithms/answer/Igor-Markov">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/69220942951http://nellaikanth.tumblr.com/post/69220942951Fri, 06 Dec 2013 20:17:45 -0500What is the most efficient way for a programmer to get good at algorithms without participating in programming competitions?<p>Answer by Igor Markov:</p><blockquote><b>First</b>, you need to be clear on the syntax of the language you are using. Even when you plan to focus on CS theory, it seems impractical to focus on algorithms w/o having a realistic language to express them (and use pseudocode only). When teaching algorithms with C++, I see a lot of students who don’t understand pointers, arrays, references, templates, etc in C++, and this limits what they can do in practice (i.e., in projects).<br/><br/><b>Second</b>, you need to understand standard algorithms.<br/>The Cormen-Leiserson-Rivest-S<wbr></wbr>tein book (3rd+ ed) is good for this (but does not explain how to implement anything in a particular language). Knuth’s TAOCP is the nuclear option for studying algorithms - even if you finish, it will take many years (also note that Knuth is still working on new volumes).<br/><br/><b>Third</b>, you need to understand how to implement and use standard algorithms with a given language, e.g., using the STL in C++. The book by Josuttis on the C++ standard library (2nd ed) is excellent. You may also get a book on algorithms that shows C++ implementations, such as the one by Weiss (3rd ed). There are other sources that use Java or C#. However, experience shows that students who went through a C++ based course can migrate to Java and C#, but not as easily the other way around.<br/><br/><b>Fourth</b>, practice algorithmic problem solving. 
This is harder, but the CLRS book has a number of great exercises. Solve them, implement them, and test your solutions. There are also books with algorithmic/programming problems, including interview questions. They should be easy to find on Amazon. But becoming good at algorithmic problem-solving will take time.<br/><br/><b>Fifth</b>, algorithm analysis. The CLRS book is also good for this.<br/><br/><b>Sixth</b>, large-scale algorithm design and development. This comes with experience, including a lot of reading (research papers), trying out different ideas (research projects) and/or graduate studies. Probably beyond the scope of your question.<br/><br/><u>Update</u>: in teaching algorithms and data structures (e.g., in the current semester at the University of Michigan), an extremely efficient tool is an <i>autograder</i>. Students submit their project online and get the results of testing on hidden test cases within 10 minutes. This provides a feedback loop that is just not possible otherwise. Sometimes students cannot get the right output, sometimes their algorithms/programs are too slow. However, when they know just how good or poor their solution is, they can focus on improving it. If you are not taking a course at a university, you can get a similar environment in online programming competitions in training mode. 
Google Code Jam would be one of those.</blockquote><span class="qlink_container"><a href="http://www.quora.com/Computer-Programming/What-is-the-most-efficient-way-for-a-programmer-to-get-good-at-algorithms-without-participating-in-programming-competitions/answer/Igor-Markov">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/69220894862http://nellaikanth.tumblr.com/post/69220894862Fri, 06 Dec 2013 20:17:11 -0500What data structures/matching algorithms does vimdiff use?<p>Answer by Anirudh Joshi:</p><blockquote>Computing a diff is an instance of the longest common subsequence problem (<a href="http://en.wikipedia.org/wiki/Longest_common_subsequence_problem">http://en.wikipedia.org/wiki/Longest_common_subsequence_problem</a>). For an arbitrary number of input sequences this is an NP-hard problem: it looks for the longest sequence of elements that appears, in order but not necessarily contiguously, in every input - e.g. many source files or DNA shotgun sequencing reads.<br/><br/>However, diff only operates on 2 sequences, which reduces the complexity of the search to O(m*n), with m and n being the lengths of the two files (thanks to <span class="qlink_container"><a href="http://www.quora.com/Peter-Scott-3">Peter Scott</a></span>).<br/><br/>I’m pretty sure vimdiff uses the standard diff tool (<a href="http://en.wikipedia.org/wiki/Diff#Algorithm">http://en.wikipedia.org/wiki/Diff#Algorithm</a>) to attack this problem:<br/><br/><blockquote>This only works when a standard “diff” command is available. 
See ‘diffexpr’.</blockquote><br/><b>Source:</b> <a href="http://vimdoc.sourceforge.net/htmldoc/diff.html">http://vimdoc.sourceforge.net/htmldoc/diff.html</a><br/><br/><blockquote>On Unix-based systems, Vim should work without problem because there should be a “standard” diff program available<br/></blockquote><br/><b>Source:</b> <a href="http://vim.wikia.com/wiki/Running_diff">http://vim.wikia.com/wiki/Running_diff</a><br/><br/>Diff mainly uses the algorithm described in the paper, <i>An O(ND) Difference Algorithm and its Variations</i> (<a href="http://www.xmailserver.org/diff2.pdf">http://www.xmailserver.org/diff2.pdf</a>), found from <a href="http://stackoverflow.com/questions/805626/diff-algorithm">http://stackoverflow.com/questions/805626/diff-algorithm</a>.<br/><br/>Other references are included in the source code found here <a href="http://ftp.gnu.org/gnu/diffutils/">http://ftp.gnu.org/gnu/diffutils/</a> :<br/><br/><blockquote>The basic algorithm is described in “An O(ND) Difference Algorithm and its Variations”, Eugene W. Myers, ‘Algorithmica’ Vol. 1 No. 2, 1986, pp. 251-266; and in “A File Comparison Program”, Webb Miller and Eugene W. Myers, ‘Software—Practice and Experience’ Vol. 15 No. 11, 1985, pp. 1025-1040. The algorithm was independently discovered as described in “Algorithms for Approximate String Matching”, E. Ukkonen, ‘Information and Control’ Vol. 64, 1985, pp. 
100-118.</blockquote><br/>Code for various languages can be found at <a href="http://code.google.com/p/google-diff-match-patch/">http://code.google.com/p/google-diff-match-patch/</a><br/><br/>More discussion here: <a href="http://c2.com/cgi/wiki?DiffAlgorithm">http://c2.com/cgi/wiki?DiffAlgorithm</a></blockquote><span class="qlink_container"><a href="http://www.quora.com/Data-Structures/What-data-structures-matching-algorithims-does-vimdiff-use/answer/Anirudh-Joshi">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/69220855534http://nellaikanth.tumblr.com/post/69220855534Fri, 06 Dec 2013 20:16:43 -0500How does golang's memory management compare to Java?<p>Answer by William Ting:</p><blockquote>Go’s garbage collector is fairly rudimentary for now. It is a stop-the-world, mark-and-sweep, non-generational garbage collector. One of the biggest complaints is the periodic pauses when the GC runs.[0] This lag is an issue for latency-sensitive applications like games, audio processing, etc.<br/><br/>On the other hand, Java provides a few different GC implementations that allow for different performance optimizations (e.g. throughput vs latency) including:<br/><br/><ul><li>parallel Young generation collector</li><li>concurrent mark and sweep collector</li><li>incremental low pause collector</li><li>more: <span class="qlink_container"><a href="http://www.techpaste.com/2012/02/default-jvm-settings-gc-jit-java-heap-sizes-xms-xmx-operating-systems/#more-3569" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "techpaste.com")'>default and available JVM GC’s</a></span><br/></li></ul><br/>Also, tuning the JVM / GC is a fairly common Java practice while Go’s GC lacks any tuning knobs.[1]<br/><br/>Keep in mind that Go is a relatively young language with a single company backer. Go 1.1 was released in the last month, and they’re showing >30% improvements in certain benchmarks. 
That’s a huge flag that the implementation is still immature and there is a lot of room for growth.<br/><br/>In contrast, Java has been heavily used in enterprise for >15 years and the JVM has received a lot of corporate support over that time. I don’t have hard numbers, but there’s easily been 10-100x more developer time and money spent on Java compared to Go.<br/><br/>[0]: <span class="qlink_container"><a href="http://osdir.com/ml/go-language-discuss/2012-11/msg01513.html" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "osdir.com")'>go-language-discuss Garbage collecting stop the world for about 10 seconds </a></span><br/>[1]: <span class="qlink_container"><a href="http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html" rel="nofollow" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "oracle.com")'>Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning</a></span></blockquote><span class="qlink_container"><a href="http://www.quora.com/Go-programming-language/How-does-golangs-memory-management-compare-to-Java/answer/William-Ting">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/69220727641http://nellaikanth.tumblr.com/post/69220727641Fri, 06 Dec 2013 20:15:12 -0500Why doesn't Google use Ruby?<p>Answer by User:</p><blockquote>Google writes almost all of its infrastructure from scratch. We can’t even use jQuery, not because it’s third-party software (it is approved for production use, and has been used), but because of the infrastructure. It would need to support testing (in the Google way), parsing protocol buffers, creating Google UI elements (inspect their drop-down boxes and you will notice they aren’t native HTML inputs), XSRF, internationalization, etc. etc. etc. <br/><br/>Closure is a POS. I don’t really care what anyone else at Google says. It’s slooooooooow and extremely verbose. 
The benefits you get from the compiler are negated by the dependencies of that one package you include which has a dozen other dependencies. Still, I would choose Closure over jQuery because there would be a mountain of work I would need to do to get jQuery to work on Google’s stack. Also, none of the plugins for jQuery would be authorized, which would further negate the benefits.<br/><br/>Ruby is in the same boat. There’s a butt load of infrastructure it would need to support, and Google simply does not care enough to authorize Ruby as a usable language. If you wanted to use Ruby at Google, you would have to be in charge of adding support for all the services and infrastructure Google has.<br/><br/>Ruby also does not scale as well as Java/C++. Engineers at Google like to scale one order of magnitude beyond what they anticipate their product’s load being. This means doing things like writing back-ends in C++ for services with < 100 qps because some day we might have 1000 qps and our Java backend, which is easier to maintain and much more pleasant to code, won’t be able to handle the 1000 qps unless we deploy one more cluster….<br/><br/>The final nail in the coffin is the fact that Ruby does not compile to a binary. In order for code to be deployed in Google’s data centers, it must be in a binary form. So you would need some sort of binary (like the Ruby interpreter).</blockquote><span class="qlink_container"><a href="http://www.quora.com/Google-Engineering/Why-doesnt-Google-use-Ruby/answer/User-9467">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/69220665668http://nellaikanth.tumblr.com/post/69220665668Fri, 06 Dec 2013 20:14:29 -0500What is node.js really? And what are some essential pointers?<p>Post by Adriano Stephan:</p><blockquote>What is node.js really? 
And what are some essential pointers?</blockquote><a href="http://source.quora.com/What-is-node-js-really-And-what-are-some-essential-pointers?srid=huKI&share=1">View Post on Quora</a>http://nellaikanth.tumblr.com/post/69220575772http://nellaikanth.tumblr.com/post/69220575772Fri, 06 Dec 2013 20:13:25 -0500Hadoop RPC mechanism<p>Post by Elazar Leibovich:</p><blockquote>Hadoop RPC mechanism</blockquote><a href="http://hadoop.quora.com/Hadoop-RPC-mechanism?srid=huKI&share=1">View Post on Quora</a>http://nellaikanth.tumblr.com/post/68104177267http://nellaikanth.tumblr.com/post/68104177267Mon, 25 Nov 2013 17:58:58 -0500Hadoop RPC mechanism<p>Post by Elazar Leibovich:</p><blockquote>Hadoop RPC mechanism</blockquote><a href="http://hadoop.quora.com/Hadoop-RPC-mechanism?srid=huKI&share=1">View Post on Quora</a>http://nellaikanth.tumblr.com/post/68104150523http://nellaikanth.tumblr.com/post/68104150523Mon, 25 Nov 2013 17:58:39 -0500Why do Tamils hate Hindi, and why are people in Tamil Nadu reluctant to learn Hindi?<p>Answer by Venkatesh Pandian:</p><blockquote>Please find my answer inline with Shailesh Nadar’s answer (I hope he doesn’t have a problem with that).<br/><br/>The popular idea behind the question is ‘Why can’t Tamils follow something that goes for the entire nation?’.<br/><br/>There, I said Nation. How long have we been a country called India? History taught us the idea of Bharath as a land of certain people and practices owing to our cultural / heritage interactivity. But were we under a single construct called India or Bharath? <br/><br/>Take a look at the great Indian kingdoms. The Mughals. Look at the not-ever-before-not-ever-after Maurya Kingdom. <br/><div><img class="landscape qtext_image zoomable_in zoomable_in_feed" src="http://qph.is.quoracdn.net/main-qimg-473d702acdaaa8bce9d2e60468c85cb8" master_src="http://qph.is.quoracdn.net/main-qimg-d0376ae3ea521c4cf002faf84dcedd44" master_w="600" master_h="575"/></div><br/>Look closely. 
Do you see Tamil Nadu within the spread of such a vast kingdom (that messed with the Persians)?<br/>Not to brag or to give the wrong idea, but doesn’t that mean something?<br/>Can you see something that says Chola? Now here is their kingdom <br/><span class="qlink_container"><a href="http://catdirtsez.blogspot.com/2012/11/development-of-chola-empire-in-southern.html" class="external_link" target="_blank" onmouseover='return require("qtext").tooltip(this, "blogspot.com")'>Development of the Chola Empire in Southern India 900 AD - 1300 AD</a></span><br/><br/>My point is, we Tamils have a history. A history that stands out from the rest of the country. A history that puts us in the cradle of Indian Civilization. <a href="http://www.harappa.com/arrow/stone_celt_indus_signs.html">http://www.harappa.com/arrow/stone_celt_indus_signs.html</a> <br/>Tamil Nadu was never a part of the empires that gave rise to the idea of Bharath. It is the British Empire that gave us a unified identity, Indians. If it were not for the British, chances are that India would be something similar to the European Union. <br/>(Though I wish dearly for that to be the case).<br/><br/>Let’s just leave it there and consider a different scene. Let’s pretend that some other country invaded the European Union and brought its members under one single government, calling it Europia. Do you think you could make the French, if not the other European countries, speak English? <br/><br/>You get the gist of it. We don’t want to learn Hindi. It is alien to us. Its grammatical structure is single dimensional while we are used to a wide range of structural independence and very deep grammar definitions backing it up. We were never a part of a Hindi-speaking kingdom. When a draft was literally forced down our throats, our people revolted. 
<span class="qlink_container"><a href="http://en.wikipedia.org/wiki/Anti-Hindi_agitations_of_Tamil_Nadu" class="external_link" target="_blank">http://en.wikipedia.org/w<wbr></wbr>iki/Ant…</a></span> <br/>Personally, I was repelled by the idea of revolting against democratic India’s policy. But seeing some of my friends who were born and brought up in Tamil Nadu being unable to read or write in Tamil owing to their CBSE education, I get the point. If the draft that mandates Hindi as the second language had its way, I would be looking at almost all of the Tamil population being unable to read or write in Tamil. You know what that means, right? Linguicide and language attrition. You take out the language of an ethnic group, and they have lost their culture. Don’t trust me. Ask the first batch of Christian missionaries that came over to India.<br/><br/>We Indians are so related. The relation spreads its roots from Indo-Aryan languages to physical appearance to culture. But still, we have to keep something else in mind as well. Everybody gets to enjoy their own freedom, freedom of language included. You can choose to respect that and stop cribbing about us not speaking Hindi.<br/><br/>p.s: I have been told that I am not an ‘Indian’ at all if I don’t speak Hindi. Avoid that!! It’s a two-edged knife.</blockquote><span class="qlink_container"><a href="http://www.quora.com/Tamil-People/Why-do-Tamils-hate-Hindi-and-the-people-in-Tamil-Nadu-reluctant-to-learn-Hindi/answer/Venkatesh-Pandian-1">View Answer on Quora</a></span>http://nellaikanth.tumblr.com/post/67996990804http://nellaikanth.tumblr.com/post/67996990804Sun, 24 Nov 2013 17:02:51 -0500