How Do I Conduct a Simple Search?
Am I ready to search?
Every search has a target: some collection of text within which you wish to find something. With the Personal Computer version of the WCT Reader, there is no problem, because there is no search box until you Select a target ebook.
The Internet version is different. When you arrive at a page offering search, the search box is already there, opened with a pre-selected target: either one that the site sets for you, or one listed in a cookie as the one you had open the last time you were at that site. Some first-timers arriving at a site may see a search box and think that the search target is the entire Internet or some vast collection. That may be the case at some point in the future, but not yet! So check the title currently open. Is it the one you want? If not, Switch it.
How do I search for a single word?
Position the cursor over the text box near the top of the page. Click once. Type the word. Then click once on the Search button or press the Enter key on your keyboard. In Screen 1 the word "evening" has been typed into the search box:
The top of the first result page looks like this:
At the top is the name A Collection from G.K. Chesterton, which the table of contents shows to be a "bookshelf" made up of 30 different books, combined here into a single collection. (Recall that a bookshelf and an ebook behave exactly the same way. The bookshelf variation is simply bigger.)
The word "evening" appears three times, highlighted with a background color. From this, we know that the word appears in the current collection in lower case, in title case (first letter a capital letter), and in upper case (every letter a capital letter). 196 records were found. In some of those records the word occurs more than once. That is what the Frequency score is about. "Evening" appears four times in the first record and three times in the second record. There is a headings line (in this case, book title followed by chapter title) before each record summary. The summary is normally incomplete; typically there are scattered series of three dots, each an ellipsis that indicates missing words. The summary is intended to give a first impression of what is in the record. To see any record, click on the Expand button or large plus sign to its left. Screen 3 results if you click on Expand before the first record:
You would need to scroll down to see the rest of that record.
Notice the turquoise arrow in the toolbar at the upper left. Click on it. You are taken back to the last screen before this one, in this case to Screen 2, the result page. Next, scroll to the bottom of the result page. It looks like Screen 4:
Let's notice a series of features in the lower part of Screen 4 above:
- The headings line in Entry 10 ends with "Subheading B". That means that Chapter VII is quite long; extra subheadings were inserted at intervals so that you can easily go to or return to points part way through the chapter.
- The "Meaning Score" is 100 for every record. This is true of all 196 records found. Meaning scores are calculated as 100 less the count of intervening words. When searching for a single word, there can be no intervening words. There is more on meaning scores in the discussion of searching on multiple words.
- When meaning scores are the same, results are sorted in order of decreasing frequency. All the records in sight in Screen 4 above have frequency scores of 2: the word "evening" is there twice. We will return to frequency scores below.
- Near the bottom there is a line 1-10 11-20 21-30 ... 101-110. This page contains results 1 to 10, so the "1-10" part of the line is red. All the remaining pairs are blue links.
- There are three buttons: Home, Preferences, and Advanced. These buttons appear at the bottom of every results page to give you easy access to other important options.
In the example, there were 196 records found. How do I get at later results?
That row of numbers near the bottom deserves more attention.
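For readers who like to see the arithmetic, the row of page ranges can be sketched in a few lines of Python. This is only an illustration, not the WCT Reader's own code: the function name `page_ranges` is hypothetical, and the 10-to-100 limit is taken from the Preferences setting for results per page. The Reader shows only part of the row at a time; the sketch simply lists every range.

```python
def page_ranges(total_hits, per_page=10):
    """Return the (first, last) hit numbers shown on each results page."""
    # Assumption: results per page is limited to 10..100, as in Preferences.
    if not 10 <= per_page <= 100:
        raise ValueError("per_page must be between 10 and 100")
    ranges = []
    for first in range(1, total_hits + 1, per_page):
        # The final page may be short, so cap it at the total hit count.
        last = min(first + per_page - 1, total_hits)
        ranges.append((first, last))
    return ranges


# 196 records at 10 per page, as in the "evening" example.
ranges = page_ranges(196, per_page=10)
print(" ".join(f"{first}-{last}" for first, last in ranges[:3]))
print(len(ranges), "pages; last page is", ranges[-1])
```

With 196 records, 10 per page gives 20 pages, while 100 per page gives just two, which is why raising the number of results per page gets you to later hits faster.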
If you click on (for example) 81-90 above, then scroll down, you are shown Screen 5:
As you select later pairs, the line of numbers across the bottom changes. You can now get at even later hits. You can get to the later hits faster by changing the number of hits in each results page. Do this by clicking on the Preferences button and selecting a higher number of results per page. The range is from 10 to 100 per page. See Screen 6:
Notice the scroll bar in the selection. You have to use it to get at numbers higher than 39 in the example. As always, you have to click on Save Preferences to make a new setting stick. The preferences are saved for the next time you use the WCT Reader, provided you leave the program through a normal exit.
How do I search for multiple words?
Now the fun begins. Position the cursor over the text box near the top of the page. Click once. Type two or more words, for example "morning light", each separated by a space. Then click once on the Search button or press the Enter key on your keyboard. Scroll down in the first results page to see Screen 7:
Two features of Screen 7 are worth noting: the meaning score and the frequency score.
Recap: What is a meaning score?
When searching for a single word, meaning scores were uniformly 100. The meaning scores that are visible in Screen 7 are: 100 (once), 99 (three times) and 96 (once). If the whole page were visible, it would show that hits 1 through 5 all score 100. That means that there are NO intervening words between the words that we requested ("morning light"). Hits 6, 7, and 8 contain "light of morning", that is, exactly one word between "morning" and "light" (in whichever order). In each case, 100 less the count of intervening words (one) gives a meaning score of 99. In hit number 9, "of a curiously clear" (4 words) intervenes between "light" and "morning"; the meaning score is 100 less 4, which is 96.
If you try this example in the G.K. Chesterton Collection, you will find that hit # 10 has "light was burning in the broad morning", with five intervening words, for a score of 95. Hit # 11 has "As he swung himself up also into the evening light he felt as if he were rising on enormous wings. Legends of the morning of the world which he had heard in childhood": 13 intervening words, score 87. Hit # 11 is not really on topic, whereas hits 1 through 10 all had to do with light in the morning. Hit # 12, with a meaning score of 84, is quite good: "a white weird morning when the mists were slowly lifting -- one of those mornings when the very element of light appears as something mysterious and new".
The point: Because results are arranged in order of decreasing meaning score, the hits early in the list are often right on what you are after, whereas the later a hit appears in the list, generally the less assurance there is that it is on topic.
Recap: What is a frequency score?
If you were to expand any of hits 5, 6, 7, or 8, you would find only two words that have background colors: one each of "morning" and "light". Each of these hits has a frequency score of two, a total of two occurrences of the words that you requested.
Hit # 9 shows a frequency score of three. To see why, click on its Expand button. Here in Screen 8 is a part of the result:
The frequency score is a count of the number of occurrences of the terms you requested within the result. Frequency scores are used to break ties between results with the same meaning score. Why? Because if the words you are after occur 23 times in one hit and only 11 times in another hit, the one with 23 occurrences is a bit more likely to be on topic.
A side note: If you set a high value in your Preferences for how much is shown on your screen, the frequency score may be higher, since there is more text available in the hit. The comparison is still fair, since all hits in a search result share the same maximum length as set in Preferences.
How do I recall where there is discussion of an idea?
Did Gilbert Keith Chesterton have a fixation on cows? Possibly. In this 30-book subset of his writing, there are 37 records that have either "cow" or "cows". (Admittedly, other topics were much more important to him.) Suppose you recall that Chesterton had said something about the history of cows, and you wish to find it again. One way is to input the two words "history cows" and see what comes up. Here in Screen 9 is the happy result:
Chesterton neatly skewered the Marxist view of history in three delightful segments of A Miscellany of Men. Simply expand the hits, read, and enjoy.
In the Advanced Search, we will look at further methods to focus a search quickly on a topic of interest.
What is the limit on the number of words I can enter in a simple search?
Fifteen. Chances are you will never input that many words for a simple search. The reason is that the more words you require, the less likely it is that all of them appear close to each other. Simple search uses what is called Boolean AND logic: this word AND that word AND the other word, etc. If a word is missing in a record, that record will not be included in the results list. It's quite difficult to input 15 words which will all be present, unless they are very common words.
Try this experiment. Click in the empty search entry box so that it can accept what you type. Then input about six or eight very common words, for example "and the this is at to when that". Leave a space between each word. It helps to spell them correctly! See the eight words in the search box near the top of Screen 10:
With the words in place in the search box, either press the Enter key or click on the Search button to the right of the text box. If you have chosen really common words that are used in the ebook currently open, chances are you will get a results page similar to Screen 11:
If there are no hits, the experiment did not come out as expected. Click at the end of the text box, then use the backspace key to remove some of the words. But make sure there are two or more words left. If nothing else works, there are few ebooks that do not have the two words "and" and "the" near each other.
Notice:
- You can't do this experiment on most search engines. They keep you from searching on the really common words, so-called "stop words".
- Even when you request many words, the search is fast and powerful. On the personal computer version, the results appear instantaneously.
- Each word you asked for appears highlighted with a distinct background color.
Colorful, isn't it?
Searches so far have been for "within 20 words of each other". Can I adjust that count?
The count may be set at any number from 99 down to 0. You can control that through a question in the Preferences page, as in Screen 12. (Click the Preferences button on the Home page, then scroll down to get to this display.)
What is the point of this preference? Recall from the Overview page:
Ideas are expressed in words. When the words you search for are widely scattered, maybe paragraphs apart, they are not related to the idea you are after. It is worse than useless for a search engine to offer results in which the desired words are far apart; you are left to dig through irrelevant results to get at what you are after. Our technology filters out the words-far-apart results. Words Close Together presents only the results in which the words you ask for are close together, and it ranks the results so that the closer the words are together, the nearer a result is to the top of the list. This form of relevance ranking makes it very likely that the results at the top are meaningful and that they relate directly to the idea you have in mind.
This preference sets one of the filters. By setting the ceiling lower, you are saying, "Don't bother to show me results where the words I want are spread out further than this." Set the ceiling higher to see more.
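To make the filtering and ranking concrete, here is a minimal Python sketch of the ideas described on this page: Boolean AND matching, a meaning score of 100 less the count of intervening words, a ceiling that drops results whose words are too far apart, and frequency as the tie-breaker. It is an illustration under stated assumptions, not the WCT Reader's actual implementation. The names `score_record` and `search` are hypothetical, the sketch measures the closest occurrences of the requested words, and how the Reader counts intervening words for three or more terms is assumed rather than documented here.

```python
import re


def score_record(text, query, ceiling=20):
    """Return (meaning, frequency) for one record, or None if it is filtered out."""
    words = re.findall(r"[a-z']+", text.lower())
    terms = set(query.lower().split())
    if not terms:
        return None
    # Positions of every requested word found in the record.
    positions = [(i, w) for i, w in enumerate(words) if w in terms]
    # Boolean AND: every requested word must be present.
    if {w for _, w in positions} != terms:
        return None
    # Slide a window over the matches to find the tightest span that
    # contains all requested words (assumption: closest occurrences count).
    counts = dict.fromkeys(terms, 0)
    covered = 0
    left = 0
    best_gap = None
    for pos_r, word_r in positions:
        counts[word_r] += 1
        if counts[word_r] == 1:
            covered += 1
        while covered == len(terms):
            pos_l, word_l = positions[left]
            # Intervening words = words in the span that are not requested terms.
            gap = (pos_r - pos_l + 1) - len(terms)
            if best_gap is None or gap < best_gap:
                best_gap = gap
            counts[word_l] -= 1
            if counts[word_l] == 0:
                covered -= 1
            left += 1
    # The ceiling filter: drop records whose words are spread too far apart.
    if best_gap > ceiling:
        return None
    return 100 - best_gap, len(positions)


def search(records, query, ceiling=20):
    """Rank records by decreasing meaning score, then decreasing frequency."""
    hits = []
    for text in records:
        scored = score_record(text, query, ceiling)
        if scored is not None:
            hits.append((scored[0], scored[1], text))
    hits.sort(key=lambda hit: (-hit[0], -hit[1]))
    return hits


records = [
    "light was burning in the broad morning",
    "the light of morning",
    "morning light",
    "no match here",
]
for meaning, frequency, text in search(records, "morning light"):
    print(meaning, frequency, text)
```

Lowering `ceiling` in the call to `search` plays the role of the preference: with a ceiling of 0, only the record in which the two words are adjacent would remain.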
words close together.com | The "Research Quality" Search Engine by Marpex, Inc.