Google and your website: a blind alliance
Suppose you have a website, “onlineshopper.com”, and you search for it on Google with the keywords “online shopping website”. The results show pages from your website alongside other websites related to your keywords. That is expected, since we all want Google to crawl and index our websites, and it is common to all eCommerce websites.
A. Your website “onlineshopper.com” is directly allied with Google.
B. Your website and your web server (where you store all usernames and passwords) are directly related to each other.
C. Alarmingly, Google is indirectly allied with your web server.
You may be convinced that this is normal and not expect an attack that uses Google to retrieve information from your web server. Now, on second thought, instead of searching “online shopping website” on Google, what if I search “online shopper website usernames and passwords”? Will Google be able to provide the list of usernames and passwords for the online shopper website? As a security consultant, my answer would be “MAYBE, SOMETIMES!”, but if you use Google dorks (search operators that narrow down what Google returns), the answer will be a big “YES!” whenever your website is missing the right security settings.
Google Dorks can be intimidating.
Google looks like a purely helpful service until you see the other side. Google may have answers to all your queries, but you need to frame your questions correctly, and that is where GOOGLE DORKS step in. A Google dork is not complicated software that you install, run, and wait on for results; it is a combination of search operators (intitle, inurl, site, intext, allinurl, etc.) with which you can query Google to get exactly what you are looking for.
For example, suppose your goal is to download Java-related PDF documents. A normal Google search would be “free java pdf documents download” (free being the keyword without which no Google search feels complete). With Google dorks, your search becomes “filetype:pdf intext:java”. With these operators, Google understands exactly what you are looking for, and you get more accurate results than with the previous search. That looks promising for effective Google searching.
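To make the structure of a dork concrete, here is a minimal Python sketch that assembles operator:value pairs into a query and turns it into a search URL. The helper names build_dork and search_url are made up for this illustration, and the URL pattern is an assumption about Google's usual search address rather than a documented API.

```python
from urllib.parse import urlencode


def build_dork(*terms, **operators):
    """Join plain search terms and operator:value pairs into one query string."""
    parts = list(terms)
    for name, value in operators.items():
        # Operators take no space after the colon; quote values that contain spaces.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


def search_url(query):
    """Build a search URL for the query (assumed URL pattern, for illustration)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_dork(filetype="pdf", intext="java")
print(query)  # filetype:pdf intext:java
print(search_url(query))
```

The point of the sketch is that each operator is written without a space after the colon, and that plain keywords and operators can be mixed freely in one query.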
However, attackers can use these operators for a very different purpose: to extract information from your website or server. Suppose I need usernames and passwords that Google has cached from servers. I can use a simple query like “filetype:xls password site:in”, which returns Google's cached contents of different websites in India that have saved usernames and passwords in spreadsheets. It is as simple as that. Coming back to the online shopper website, if I use the query “filetype:xls passwords inurl:onlineshopper.com”, the results could dismay anyone. In simple terms, your private or confidential information will be available on the Internet, not because someone hacked your systems, but because Google was able to retrieve it at no cost.
How to prevent this?
Robots (often referred to as web bots, wanderers, crawlers, or spiders) are programs that traverse the web automatically. Search engines such as Google, Bing, and Yahoo use robots to scan websites and extract information.
robots.txt is a file that tells search engine robots what they may and may not access on your website. It is a form of control that you have over the search engines. Setting up robots.txt is not rocket science; you only need to know what information should and should not be exposed to search engines. A typical configuration names a crawler with a User-agent line and lists the paths it should skip with Disallow lines.
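As a rough sketch of such a configuration, the Python snippet below embeds a sample robots.txt and checks it with the standard library's urllib.robotparser. The paths /admin/, /backup/, and /exports/ and the file names are hypothetical, chosen only for this example; substitute whatever your own site should keep out of search results.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the example site; the paths are assumptions.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
Disallow: /exports/
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def allowed(path, agent="Googlebot"):
    """Return True if a crawler honoring this robots.txt may fetch the path."""
    return parser.can_fetch(agent, "https://onlineshopper.com" + path)


print(allowed("/products/shoes.html"))    # True
print(allowed("/exports/customers.xls"))  # False
```

Keep in mind that robots.txt is advisory: well-behaved crawlers such as Googlebot honor it, but it does not block anyone from requesting the files directly, so sensitive files are safer kept off the public web server altogether.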
Unfortunately, website designers often overlook these robots.txt settings or configure them inappropriately. Surprisingly, most of the government and university websites in India are prone to this attack, revealing sensitive information about their websites. With malware, remote attacks, botnets, and other high-level threats flooding the Internet, Google dorks can be even more threatening, because they require nothing more than a working Internet connection on any device to retrieve sensitive information. And it does not end with retrieving sensitive information: with Google dorks, anyone can find vulnerable CCTV cameras, modems, mail usernames, passwords, and online order details simply by searching on Google.