Knowing what we don’t know (on the web)

Welcome to the third blog of the technology-aided gut (TAG) checks series. In this year-long skills-transfer blog series, we use an interactive, just-in-time learning strategy to help you learn to do TAG checks on your data.
In our last posting we talked about six techniques to make our questions more precise so as to get the best answers from the Web. In this blog, we look at the other side of the equation: how can we be reasonably confident that the answers we get from an online resource are correct? How can we know that the web has given us the right answer when we do not have the subject matter expertise ourselves?

Path to “Confucian” wisdom

How to know what you don’t know

The adage “True wisdom is knowing what you don't know” has been attributed to Confucius. While addressing this philosophical statement is beyond the scope of this blog, it seems appropriate to borrow from ancient wisdom for the title of a pragmatic article. Knowing what you do not know is the essential problem of learning in the modern era. Legacy learning depends on teachers and textbooks that you can rely on to be correct. In contemporary learning, however, how can you tell the correct from the incorrect if you don’t have sufficient knowledge of a domain?
We describe a four-step process you can use to eliminate the really bad answers and get a decent idea of which ones are very good.
The process cannot guarantee that the answers we get are absolutely correct, but the level of accuracy it delivers will be useful in most cases.

The process is as follows:

  1. Filter out the really bad answers
  2. Understand each answer, really understand
  3. Confirm quality of the source
  4. And, finally choose the best answer
Step 1: Filter out the really bad answers
Finding the right answer means knowing the wrong answers. While there is no foolproof way to weed out the incorrect answers, there are some giveaways we can look for:
  • Is the answer changing the question you are asking? (bad)
  • Is the answer concise and concrete? (good)
  • Does the answer have data supporting it? (good)
  • Is the answer an opinion rather than a statement of fact? (bad)
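The four giveaways above can be turned into a rough tally. The sketch below is only one possible way to do it: the names `AnswerSignals`, `giveaway_score` and `keep_answer`, the equal weighting of the signs, and the cut-off of zero are all assumptions made for illustration, not part of the process itself. The yes/no judgments still have to come from you as the reader.

```python
from dataclasses import dataclass


@dataclass
class AnswerSignals:
    """Yes/no observations about one candidate answer, filled in by the reader."""
    changes_question: bool      # bad sign
    concise_and_concrete: bool  # good sign
    has_supporting_data: bool   # good sign
    is_opinion: bool            # bad sign


def giveaway_score(signals: AnswerSignals) -> int:
    """Return +1 per good sign and -1 per bad sign (range -2 to +2).

    Equal weights are an assumption; adjust them to taste.
    """
    good = int(signals.concise_and_concrete) + int(signals.has_supporting_data)
    bad = int(signals.changes_question) + int(signals.is_opinion)
    return good - bad


def keep_answer(signals: AnswerSignals, threshold: int = 0) -> bool:
    """Keep an answer only if its score is above the (assumed) threshold."""
    return giveaway_score(signals) > threshold


# Hypothetical example: a concrete, data-backed answer that stays on topic.
strong = AnswerSignals(False, True, True, False)
# Hypothetical example: an opinion that answers a different question.
weak = AnswerSignals(True, False, False, True)
print(giveaway_score(strong), keep_answer(strong))
print(giveaway_score(weak), keep_answer(weak))
```

A tally like this will not find the right answer for you, but it makes the filtering decision explicit and repeatable across the 10 or so candidates.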
Step 2: Understand each answer, really understand
So now, let’s say you have filtered the potential answers from 10 down to 4. How do we eliminate the rest? Read each answer carefully, and if you run into critical words you do not understand, search the web for their meaning. If it is a definition you are looking for, there are specific rules you can follow to find what you want. On Google, for example, type “define:term”, where term is the word you are looking up. Take the following question:
How can you make a profit on the stock market?
Let’s say one answer found is:
“By going short an overpriced security, while concurrently going long the portfolio the Arbitrage Pricing Theory calculations were based on, the arbitrageur is in a position to make a theoretically risk-free profit.”
A careful reading of the answer would mean understanding the terms: portfolio, the Arbitrage Pricing Theory, going short, and security.
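As a small illustration, the “define:term” lookups for those four terms could be generated in one go. This is a minimal sketch: the helper names `define_query` and `google_search_url` are made up for this example, and the search URL pattern is an assumption about how Google accepts a query in the address bar.

```python
from urllib.parse import quote_plus


def define_query(term: str) -> str:
    """Build a 'define:term' query string for a single term."""
    return f"define:{term.strip()}"


def google_search_url(query: str) -> str:
    """Wrap a query in a search URL (URL pattern assumed for illustration)."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# The terms picked out of the sample answer above.
terms = ["portfolio", "Arbitrage Pricing Theory", "going short", "security"]
for term in terms:
    query = define_query(term)
    print(query, "->", google_search_url(query))
```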
Let’s say you are able to google the terms and find out what each one means. How much time should you spend searching before you conclude it is too much? A useful rule of thumb is what we can call the 5-2 rule. If there are more than 5 words you don’t understand, the resource is probably one you are not ready to use yet. If there are fewer than 5 such words, but each word takes you more than 2 searches, then the resource is probably not your best bet. If there are terms you are unable to locate, or the reasoning described is something you cannot follow, it could be that the answerer is making unstated assumptions. Unless those assumptions are made explicit, it will be difficult for you to understand the answer, and such answers should be discarded even though they may be correct.
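The 5-2 rule is simple enough to write down as a check. The sketch below rests on two interpretive assumptions: an answer with exactly 5 unknown words is treated as still within the limit (the rule only rules out "more than 5"), and a single word that needs more than 2 searches is enough to fail the answer. The function name `passes_5_2_rule` and the search counts in the example are hypothetical.

```python
from typing import Mapping


def passes_5_2_rule(
    searches_per_word: Mapping[str, int],
    max_words: int = 5,
    max_searches: int = 2,
) -> bool:
    """Apply the 5-2 rule of thumb to one candidate answer.

    searches_per_word maps each unfamiliar word to the number of web
    searches it took to understand it.
    """
    # More than 5 unknown words: probably not a resource you are ready for.
    if len(searches_per_word) > max_words:
        return False
    # Assumption: any single word needing more than 2 searches fails the rule.
    return all(count <= max_searches for count in searches_per_word.values())


# Hypothetical search counts for the terms in the sample answer.
effort = {
    "portfolio": 1,
    "Arbitrage Pricing Theory": 2,
    "going short": 1,
    "security": 1,
}
print(passes_5_2_rule(effort))
```

Keeping a quick count like this while you read gives you a clear stopping point, so that one hard answer does not eat the whole afternoon.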
Step 3: Confirm quality of the source
Let us say you have narrowed down the potential answers from 4 to just 2. Are there any metrics you can use to confirm the quality of the responder and the source? Many question-and-answer tools allow users to upvote or like an answer. Keeping timeframes in mind (very popular answers from a while ago may be obsolete by now), lots of upvotes are good and downvotes are bad. The quality of the site itself should also be assessed. At this point, you may want to develop and use your own metrics and ways to verify quality.
Take this question:

One of the answers to this question, offering a workaround as a solution, has 164 votes. That is quite a high number, suggesting the quality of the answer is likely to be high.
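If you want to combine votes with the timeframe caveat, one home-made metric is to discount net votes by the age of the answer. The sketch below is purely illustrative: the function `discounted_votes`, the idea of a "half-life", and the default of 3 years are all assumptions, not a standard measure used by any question-and-answer site. The ages used in the example are hypothetical as well.

```python
def discounted_votes(
    upvotes: int,
    downvotes: int,
    age_years: float,
    half_life_years: float = 3.0,
) -> float:
    """Net votes, halved for every `half_life_years` the answer has aged.

    The half-life is an assumed tuning knob: fast-moving topics may
    deserve a shorter one, stable topics a longer one.
    """
    if age_years < 0 or half_life_years <= 0:
        raise ValueError("age must be >= 0 and half-life must be > 0")
    net_votes = upvotes - downvotes
    return net_votes * 0.5 ** (age_years / half_life_years)


# A 164-vote answer, scored as if it were brand new and as if it were 3 years old.
print(discounted_votes(164, 0, age_years=0))
print(discounted_votes(164, 0, age_years=3))
```

Comparing the two remaining answers on a score like this keeps an old but once-popular answer from automatically beating a newer one.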

Step 4: And, finally choose the best answer
Now, use any additional metrics you might want to apply and choose the right answer.
We should note that the guiding principles we used to develop the process are the same ones we used to come up with the techniques for forming precise queries in our last blog. Asking the right question and choosing the right answer both require proficiency in the terms used in the domain where you have a question. Developing the vocabulary, understanding the assumptions that the responder may be making, getting a sense of the quantification needed, and becoming familiar with reputable sources all require a well-defined process such as the one we have described above. After a while, the set of informal rules transforms into a “gut feeling”, or an intuition, for that domain.

Before we say goodbye - till next month

In this blog we described a process that helps us be reasonably confident that the answers we find from an online resource are correct, especially when we do not have the subject matter expertise ourselves. In the next two blogs we take a critical look at some contemporary knowledge formats. Animated GIFs and videos: do they teach or entertain? Is interactivity a good thing? Always? We look at the pros and cons of animated GIFs and videos in providing technical knowledge to adult professionals.

