Frequently Asked (not so smart) Questions

Abir Qasem

This blog is the second in a year-long skills transfer discussion/blog series on technology aided gut (TAG) checks. We use an interactive, just-in-time learning strategy to help you learn to do TAG checks on your data.

Many of us fondly remember from our school (and college) days that the best and most inspiring teachers always told us that "there are no bad questions". No matter how silly our questions were, the best teachers had the talent to transform an uninformed question into a learning experience. Even in the age of AI (Artificial Intelligence), that quality is still uniquely human (Google and even IBM’s Watson are not there yet)! So, for an adult learner who is using online resources to learn technical skills, asking the right question is important. If you don’t ask the right question, the Internet will not give you an answer. Even worse than not getting an answer, you may get the wrong answer. This blog is all about asking the right question. More specifically, it is about coming up with precise and specific search queries when you are searching online resources to further your knowledge or solve a specific problem.

The Internet is the world's largest knowledge repository, but it is still far from becoming a one-stop knowledge shop. We still need a vast education industry (in the US, it is close to a trillion dollars) consisting of teachers, mentors, training, schools/colleges etc. Unlike machines, and, by extension, unlike the Internet, we humans have an unequalled capability to deal with ambiguity. We do not need to always work under a precise set of rules. We also have a propensity to be ambiguous in framing our questions. Therefore, we need expensive human intervention to remove the ambiguity factor from the human-to-machine knowledge loop.

In the physical world, there is a high level of interactivity between the asker of a question and the human provider. This interactivity, coupled with the human ability to deal with ambiguity, helps refine the question by making it precise enough to answer. On the Web, such interactivity is much harder to attain.

One of the most recent attempts to remove ambiguity from human-to-computer knowledge exchange has been the Semantic Web. The Semantic Web envisions a collection of formal and structured data that would augment the present Web so that it is better equipped to deal with ambiguity. However, an important prerequisite is that those who provide the data need to mark it up in a way that complies with the Semantic Web. This has proven to be a burden and is considered one reason why the Semantic Web has not taken off the way it should. Here is a technical paper by one of the authors of this blog (Abir Qasem) (warning: it is a bit dense!) which explores the feasibility of the Semantic Web.

When we think of human-machine interactivity, one of the first things that comes to mind is chatbots. Although they have been around for a while, they frequently frustrate humans who are looking for real answers and not just for entertainment. Take Anna of Ikea. If you try to ask even a slightly ambiguous question, she will fumble. She needs to be updated often. This is her state on a Sunday at 3:00 PM at the time of writing this blog. She may be useful for looking at Ikea items (something that does not require much interactivity). However, her low availability makes her less useful. Further, she does not provide the real help most Ikea customers look for: interactivity in helping assemble furniture.
 
Enter Watson. IBM Watson is “a technology platform that uses natural language processing and machine learning to reveal insights from large amounts of unstructured data”. Much smarter than Anna and the like, Watson makes sense of unstructured data. As the IBM Watson site says, "80% of today’s data (news articles, research reports, social media posts and enterprise system data) are unstructured." Even this blog is unstructured. So if Watson is so great, what is the problem? The problem is that Watson is not easily accessible to the average user. Technical knowledge is needed to use it, at least for the time being. It also costs money to really use it.

Intelligent systems like Watson (and even chatbots) may work very well in a specific, limited domain where there is control over the staff who use them. For instance, there can be a chatbot that deals with charge codes or inventory codes. Lawyers, accountants, development professionals and others use charge codes to capture the cost of their time. If the team lead for a task types in the name of the task that he/she is working on, the chatbot could potentially provide a charge code and a summary of costs charged to the project (as opposed to a printed report). The chatbot could also tackle the question of what other tasks within the same company are similar. This scenario demands a closed problem domain (projects/task data) and the ability to mandate that employees give feedback about the system, both factors that help the system to be more effective.
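To illustrate why a closed domain helps, here is a minimal Python sketch of the charge code lookup described above. Everything in it is hypothetical: the task names, charge codes and cost figures are made up, and the function name lookup_charge_code is our own. It uses simple fuzzy string matching from Python's standard difflib module, which is far cruder than what Watson or a real chatbot would do.

```python
from difflib import get_close_matches

# Hypothetical project data: task name -> (charge code, costs charged so far).
TASKS = {
    "website redesign": ("WB-1001", 12500.0),
    "annual report": ("AR-2040", 8300.0),
    "field survey": ("FS-3310", 4100.0),
}


def lookup_charge_code(task_name):
    """Return (task, charge code, cost) for the closest known task, or None."""
    # Because the domain is closed, a slightly misspelled task name can
    # still be matched against the short list of known tasks.
    wanted = task_name.strip().lower()
    matches = get_close_matches(wanted, list(TASKS), n=1, cutoff=0.6)
    if not matches:
        return None
    task = matches[0]
    code, cost = TASKS[task]
    return task, code, cost


# A typo in the task name still finds the intended task.
print(lookup_charge_code("Website redesgn"))
```

The point of the sketch is that the list of valid answers is small and known in advance, so an imprecise question can still be resolved. On the open Internet there is no such list.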

So until we have Watson-driven systems that are free and easy to use in a number of different domains, the burden is on us humans to ask precise questions.

Six simple things to make your queries precise

Here are six things to remember that can make your questions more precise, so as to get the best results from the “dumb” Internet/machines:

  1. Think before you ask. Typing something in Google is easy. Processing what it returns is not. Searching without thinking can lead you down the wrong path. Five minutes of thinking before asking can save you five hours of agony later on.
  2. Know your domain vocabulary. The same term is used differently in different domains. Even a term as innocuous as "technical" can mean different things across disciplines.
  3. Quantifiable terms are useful. Short questions can help avoid redundancy. Grammar usually does not matter.
  4. Avoid assumptions or, if you have to make them, make them explicit.
  5. Use search engine “tricks”. Know the basics, and use advanced search whenever you can. Search for images on Google here. Finally, search operators can be your best friend. For example, when you put a dash before a word or site, Google excludes results that mention it.
  6. Keep in mind source quality. Google will happily give you a bunch of results. While Google provides results in order of relevance, the best sources are not always on top. It is a good idea to have a sense of source quality in the domain that you are working on.
Some rules of thumb regardless of the domain:
  1. Active forums are a good place for current advice.
  2. Ask established experts on the web.
  3. General forums are bad. For example, Yahoo forums usually feature in the top results but are unreliable for the most part.
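To make the dash operator from tip 5 concrete, here is a minimal Python sketch that assembles a search URL with excluded terms. The function name build_search_url and its parameters are our own, made up for illustration. The only outside assumption is the familiar https://www.google.com/search?q=... address format.

```python
from urllib.parse import urlencode


def build_search_url(question, exclude=()):
    """Build a Google search URL, adding a -term for every excluded word."""
    # A leading dash asks the search engine to drop results mentioning the term.
    parts = [question.strip()] + ["-" + term.strip() for term in exclude]
    query = " ".join(parts)
    # urlencode takes care of spaces and punctuation in the query.
    return "https://www.google.com/search?" + urlencode({"q": query})


url = build_search_url(
    "How do I pull my blog into my web page automatically?",
    exclude=["wordpress"],
)
print(url)
```

Typing the query by hand works just as well. The sketch only shows that the exclusion is nothing more than a word with a dash in front of it.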
Evolution of a good question and what it gets you

Vague question: How do I cross promote my blog and my web page?
  • Search Results: 39 Actionable Ideas For Driving Traffic To Your Website ... How to use Google+ to Promote your Blog…  Wishpond 5 Creative Ways to Drive More Traffic to Your Blog
  • Verdict: NOT helpful
  • Why: Not specific enough. Think about what you want. For example, what you may really want is to put your blog in your web page.
A bit better: How do I pull my blog into my web page?
  • Search Results: I successfully embed my wordpress blog into my website....Embedding into a website using HTML wordpress.org... How do I add a wordpress blog into my existing html website. wordpress.org? ... Can I embed my WordPress content in my website?
  • Verdict: A bit more helpful. But I don’t have a WordPress blog. I use blogger.com. What should I do?
  • Why: The results assume I have a WordPress blog, but I don’t. Also, I don’t want to do this just once. I want it to happen automatically. Next time, I will be more specific and quantifiable.
A bit more precise: How do I pull my blog into my web page automatically? -wordpress
  • Search Results: The "-" (dash) before "wordpress" on Google removes all search results with references to wordpress. This gives me the following: How To Display Recent Blog Posts On Your Website.....How can I embed blog postings from blogspot into my google webpage ... How to embed Twitter timelines on your website?
  • Verdict: This was a bit better as I did not get references to wordpress. The results were still not that useful. Wait, what was the eighth result? It offers a new term: RSS. That may be what I need...
  • Why: 1) More precise: I used the word "automatically" 2) I also excluded all mentions of wordpress
Finally, as precise as I can be: How do I import my RSS feed to my web page? For this one I go to youtube.com and search there. I decide to pick the shortest video. I am tired of wasting time!
  • Search Result: I get the answer I need
  • Verdict: Perfect!
  • Why: 1) Quality of source: I now know that YouTube has some good answers. I also checked the video's upvotes and length, which are useful metrics to confirm and validate the quality of the source 2) Domain vocabulary: I used the term "RSS".
How to find out the RSS feed url of your blog in 60 seconds
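For readers curious about what "importing an RSS feed" involves, here is a minimal Python sketch. It is not the method from the video. It assumes the feed follows the common RSS 2.0 layout (a channel containing items, each with a title and a link). The sample feed, its example.com links and the function name rss_to_html are made up for illustration. Downloading the real feed from your blog's feed URL is left out.

```python
import xml.etree.ElementTree as ET
from html import escape

# Made-up sample feed in RSS 2.0 layout; a real feed would be
# downloaded from the blog's feed URL instead.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>My Blog</title>
    <item><title>First post</title><link>https://example.com/first</link></item>
    <item><title>Tips &amp; tricks</title><link>https://example.com/tips</link></item>
  </channel>
</rss>"""


def rss_to_html(feed_xml, limit=5):
    """Turn the first `limit` RSS items into an HTML list of links."""
    root = ET.fromstring(feed_xml)
    lines = ["<ul>"]
    for item in root.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="#")
        # Escape both values so odd characters cannot break the page.
        lines.append(
            '  <li><a href="{}">{}</a></li>'.format(
                escape(link, quote=True), escape(title)
            )
        )
    lines.append("</ul>")
    return "\n".join(lines)


print(rss_to_html(SAMPLE_FEED))
```

Running something like this on a schedule and pasting the result into the page is one way to get the "automatic" behaviour the query asked for. Many website builders offer a ready-made RSS widget that does the same job.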

Before we say goodbye until next month

Asking the computer/Internet the right question is key to getting the right answer. We have shown that it is not as easy as it may seem at first. In our next blog we will address another interesting problem: how do we know (or become reasonably confident) that the answers we found from an online resource are correct, especially when we do not have the subject matter expertise?

