Saturday, January 22, 2005

Google Counting

by Asim Jalis

I wrote a small program that tells you how popular a word or phrase is on Google. It's a good way to compare the Google popularity of different things. (See the end of this post for the code.)

For example, to see how many times the words cat and dog appear on the web, I typed:

    gcount.sh cat dog

Here are the results I got:

    cat 128,000,000
    dog 83,000,000

Clearly, cat-lovers dominate the web. (Another possibility is that the word cat has other meanings that people are discussing.)

I was also curious to see how popular the different Myers-Briggs personality types were on the web. So I ran gcount.sh with the different types and got these results:

    intp 166,000
    intj 80,100
    infp 102,000
    infj 63,000
    istp 268,000
    istj 48,400
    isfp 30,300
    isfj 23,200
    entp 61,500
    entj 21,400
    enfp 46,900
    enfj 21,500
    estp 50,900
    estj 37,300
    esfp 22,700
    esfj 17,100

It's fascinating that even though ISTPs and INTPs make up a tiny portion of the population they are so well represented on the web.

Of course, it is a little more complicated than that. The number of Google hits must be a function of: (a) the number of people writing about a type (presumably their own), (b) the number of people familiar with MBTI and interested enough in it to talk about it, (c) the number of people talking about other meanings of these acronyms.

Interestingly, the least represented group is the ESFJ followed by the ENTJ. I wonder what's going on here.

Here is the script, in case you are interested:

    #!/bin/sh
    # Print the approximate Google hit count for each argument.
    for i in "$@" ; do
        # Join the words of a phrase with '+' for the query string.
        i=$(echo "$i" | sed 's/ /+/g')
        url="http://www.google.com/search?hl=en&q=$i"
        printf '%s ' "$i"
        # Fetch the results page as text and keep only the count.
        lynx -dump -nolist "$url" |
            grep -i 'Web *Results' |
            sed 's/^.* of about \([^ ]*\) for.*$/\1/'
    done

To run this you must be on a Unix/Linux machine (or have Cygwin on Windows) with lynx installed. Paste this into a file called gcount.sh and make it executable (chmod +x gcount.sh). Then invoke it in the shell with different search arguments. E.g.

    gcount.sh intp intj estp estj
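
To see what the grep and the two sed calls do without touching the network, here is a minimal offline sketch. The function names to_query and extract_count are hypothetical, and the sample line is made up to fit the script's pattern; it is not captured Google output.

```shell
#!/bin/sh
# Offline sketch of the two text transformations in gcount.sh.

# Turn a phrase into a query string by replacing spaces with '+',
# as the script's first sed call does.
to_query() {
    printf '%s\n' "$1" | sed 's/ /+/g'
}

# Pull the hit count out of a results line, using the script's
# grep and sed expressions unchanged.
extract_count() {
    printf '%s\n' "$1" |
        grep -i 'Web *Results' |
        sed 's/^.* of about \([^ ]*\) for.*$/\1/'
}

# ASSUMPTION: a made-up results line shaped to match the pattern above.
sample='Web  Results 1 - 10 of about 128,000,000 for cat. (0.12 seconds)'

to_query 'myers briggs'      # prints myers+briggs
extract_count "$sample"      # prints 128,000,000
```

The sed expression keeps whatever non-space text sits between "of about" and "for", so the count comes back with its commas intact. If the results page is worded differently, the grep matches nothing and only the search term is printed.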