Forming a query
CSISS's search can be as simple or as complex as you
need it to be. Usually you only need to enter
a few words that best describe what you are trying
to locate. For more complicated searches, you
can use any combination of logic operators, special
pattern matchers, concept expansion, and proximity operations.
Example: geographic information system
Query Rules of Thumb
If you get too many junk results, try:
- Add some more words to your query.
- Decrease the range of the Proximity control.
- Change the Word Forms control to Exact.
- Look at the Match Info to see why those results are
showing up.
- Use the Exclusion Operator (-) to remove
unwanted terms.
- If you are searching for a phrase, hyphenate
the words together.
If you don't get any answers, or get too few:
- Remove some words from your query.
- Check your spelling.
- Increase the scope of the Proximity control.
- The information may simply not be there.
Overview of query abilities
Controlling proximity:
Mastering proximity lets you locate answers with
greater precision. The Site Search input form gives
you several options to control the search proximity:
- line: All query terms must occur on the same line.
- sentence: All query terms must occur within the
same sentence.
- paragraph: All query terms must occur within the
same paragraph or text block.
- page (default): All query terms must occur within
the same HTML document.
Ranking Factors
The ranking algorithm takes into consideration relative
word ordering, word proximity, database frequency,
document frequency, and position in text. The relative
importance of these factors in computing the quality
of a hit can be altered under RANKING FACTORS on
the Options page.
Keywords, Phrases, and Wild-cards:
To locate words, just type them in as you would in
a word processor. Letter case is ignored. The
wild-card character * (asterisk) may be used
to match just the prefix of a word or to ignore the
middle of something. To locate a number of adjacent
words in a specific order, surround them with " (double
quotation mark) characters. Putting a - (hyphen)
between words also forces word order and one-word proximity.
Examples:
| Query | Locates |
| john | john, John |
| "john public" | John Public |
| web-browser | Web browser, web-browser |
| John*Public | John Q. Public, John Public |
| 456*a*def | 1-23456-789-ABCDEF |
| activate | activate, activation, activated... (see Word Forms) |
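As a rough illustration of the wild-card rows above, the sketch below translates a query into a regular expression. It is only an aid for reading the table, not the search engine's actual matcher: the function names are made up for this example, and letting * stand for any run of characters is an assumption.

```python
import re


def query_to_regex(query: str) -> re.Pattern:
    """Translate a keyword query containing * wild-cards into a regex.

    Assumption: each * may stand for any run of characters, and letter
    case is ignored, as the text above states.
    """
    # Escape the literal pieces so characters such as '.' are matched as-is.
    pieces = [re.escape(piece) for piece in query.split("*")]
    return re.compile(".*".join(pieces), re.IGNORECASE)


def matches(query: str, text: str) -> bool:
    """Return True if the query pattern occurs anywhere in the text."""
    return query_to_regex(query).search(text) is not None
```

With these definitions, matches("John*Public", "John Q. Public") and matches("456*a*def", "1-23456-789-ABCDEF") both return True, in line with the table.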
Applying Search Logic
The CSISS search engine uses set logic for text queries.
Set logic is easier to use and provides more abilities
than Boolean logic. The examples below refer to
single keywords, but keep in mind that each keyword
can represent an entire list of things or any of the
special pattern matchers.
Sets (or lists) of things are specified by placing
the elements within parentheses, separated by commas.
Example: (bob,joe,sam,sue). In the examples
below, you could replace any of the keywords with a
list like this.
The default behavior of the search is to locate an
intersection (or 'AND') of every element within a query.
This means that the query "microsoft bob interface" is
equivalent to the Boolean query "microsoft
AND bob AND interface".
- '-' (without)
The '-' (minus) is the most commonly used logic
symbol. It means the answer should EXCLUDE references
to that item.
- '+' (mandatory)
The '+' (plus) symbol in front of a search
item means that the answer MUST INCLUDE that item.
This is generally used in conjunction with the permutation
operation.
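- '@N' (permute)
The '@' followed by a number indicates how
many intersections to locate among the terms in your
query. For example, @1 asks for one intersection, so
any two of the unmarked terms found together count as
a match. This may be confusing at first, but it is
very powerful.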
Notes: Only the '+' and '-' operations
are valid with a relevance rank search.
Examples:
| Query | Finds |
| bob sam joe | Bob with Sam and Joe (within the selected proximity) |
| bob sam -joe | Bob with Sam without Joe |
| bob sam joe @1 | Bob with Sam, or Bob with Joe, or Joe with Sam |
| A B C D @1 | AB or AC or AD or BC or BD or CD |
| +A B C D @1 | ABC or ABD or ACD |
| A B C -D @1 | (AB or AC or BC) without D |
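The rows above can be reproduced with a short sketch. It assumes, based only on these examples, that @N asks for N intersections (that is, N + 1 terms found together) among the unmarked terms, that every '+' term is added to each combination, and that any '-' term rules a match out. The function names and argument layout are hypothetical and are not part of the search engine.

```python
from itertools import combinations


def expand_permutation(required, optional, n):
    """Return every term combination that would satisfy an '@n' query.

    required: terms marked with '+', present in every combination.
    optional: unmarked terms, of which n + 1 must occur together
              (assumption: n intersections join n + 1 terms).
    """
    return [tuple(required) + combo for combo in combinations(optional, n + 1)]


def satisfies(found_terms, required, optional, excluded, n):
    """Check whether the terms found in one region satisfy the query."""
    found = set(found_terms)
    if found & set(excluded):
        return False  # a '-' term rules the region out
    return any(found.issuperset(group)
               for group in expand_permutation(required, optional, n))
```

For the query +A B C D @1, expand_permutation(["A"], ["B", "C", "D"], 1) yields the combinations ABC, ABD, and ACD, as in the table.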
Invoking Thesaurus Expansion
The CSISS search engine has a vocabulary of over 250,000
word and phrase associations. Each entry is generally
classifiable by either its meaning or part of speech.
To expand the meaning of a word or phrase within
your query, precede it with a ~ (tilde) character.
Natural Language Query
You may enter a query in the form of a sentence or
question. The software will automatically identify
the important words and phrases within your query and
remove the "noise words".
Example:
User query: What is the state of the art in Geographical
Information Systems?
Actual search: state of the art AND Geographical AND Information
Using word forms
The Word forms options give you control over
how many variations of your query terms will be sought
in your search.
- Exact: (default) Only exact matches
will be allowed.
- Plural & possessives: Plural and possessive
forms will be found. (s, es, 's)
- Any word forms: As many word forms as can
be derived will be located.
Examples:
president
EXACT : president
PLURAL: (above) + presidents president's
ANY : (above) + presidential presidency preside presides presiding presided
tight
EXACT : tight
PLURAL: (above) + tights
ANY : (above) + tightly tightening tightened tighter tightest
program
EXACT : program
PLURAL: (above) + programs program's
ANY : (above) + programming programmatic programmed programmer programmable
This is called morpheme processing, and it is generally
smarter than a traditional "stemming" algorithm. It
does not truncate the end of a word; instead, it checks
whether each candidate could be a valid form of the search term.
Notes: Thesaurus terms are treated in
the same manner. Words shorter than 4-5 characters
are not processed.
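The difference between checking for valid forms and blindly adding or truncating endings can be illustrated with a small sketch of the Plural & possessives level. The vocabulary set and the function name are hypothetical stand-ins; the search engine's own word list and morpheme processing are not described here.

```python
# Hypothetical vocabulary; a real engine would consult its own lexicon.
VOCABULARY = {
    "president", "presidents", "president's",
    "tight", "tights",
    "program", "programs", "program's",
}

# Endings listed for the "Plural & possessives" option.
SUFFIXES = ("s", "es", "'s")


def plural_and_possessive_forms(term, vocabulary=VOCABULARY):
    """Return the term plus each suffixed candidate that is a known word."""
    candidates = [term + suffix for suffix in SUFFIXES]
    # Keep a candidate only if it is a valid word, e.g. drop "presidentes".
    return [term] + [form for form in candidates if form in vocabulary]
```

Here plural_and_possessive_forms("president") keeps presidents and president's but discards the invalid candidate "presidentes".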
Controlling proximity
These options give you control over the region in
which a match must be found.
- line: all terms must be located within the same line.
- sentence: all terms must be located within the same sentence.
- paragraph: all terms must be located within the same paragraph.
- page: (default) all terms must be located within the same
document.
In all cases the best possible matches for your query
are located and ordered by decreasing quality. A bar
graph is produced to indicate the quality of each answer.
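To make the four proximity regions concrete, here is a minimal sketch that splits a document into regions and checks whether all terms fall inside one of them. The splitting rules (blank lines between paragraphs, sentence-ending punctuation followed by whitespace) and the plain substring matching are simplifying assumptions, and the function names are invented for this example.

```python
import re


def split_regions(text, proximity="page"):
    """Split text into the regions implied by a proximity setting."""
    if proximity == "line":
        return text.splitlines()
    if proximity == "sentence":
        # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
        return re.split(r"(?<=[.!?])\s+", text)
    if proximity == "paragraph":
        # Assumption: paragraphs are separated by blank lines.
        return re.split(r"\n\s*\n", text)
    if proximity == "page":
        return [text]
    raise ValueError(f"unknown proximity: {proximity}")


def matches_within(text, terms, proximity="page"):
    """Return True if every term occurs inside a single region."""
    wanted = [term.lower() for term in terms]
    return any(all(term in region.lower() for term in wanted)
               for region in split_regions(text, proximity))
```

Narrowing the setting from page to paragraph, sentence, or line makes the same query progressively harder to satisfy, which is why decreasing the proximity range reduces junk results.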
Interpreting search results
When you submit a query, the results page shows another
query form and up to 10 matching documents. If there
are more than 10 answers, a link at the top and bottom
of the list will allow you to view the next 10 in sequence.
The input form at the top allows you to further tailor
your query to home in on the desired answers, or to
submit a completely new query without having to navigate
back to the original input form.
Each answer in the result set will have a format
similar to the following:
1: CSISS Mission Statement
CSISS, the Center for Spatially Integrated
Social Science, is funded by the National Science
Foundation under its program of support for infrastructure
in the social and behavioral sciences. Its programs
focus on the methods, tools, techniques, software,
data access, and other services needed to promote
and facilitate a novel and integrating approach
to social science. The CSISS mission recognizes
the growing significance of space, spatiality,
location, and place in social science research.
It seeks to develop unrestricted access to ...
http://www.csiss.org/aboutus/mission.htm
77% | Size: 1K | Depth: 1 | Find Similar | Linked Sites
Note: The look and feel described
here is the standard search interface. The interface
may have been customized by the web site administrator.
The components of each result are:
- Result number
- Document title (clicking on this takes you to the
original document)
- Abstract (the first few hundred characters of the
document)
- Size (how big the original document is)
- Depth (how many clicks from the top of the site)
- Find Similar (finds other documents similar to this
one)
- Linked Sites (lists pages that link to this one)
Finding similar documents
The Find Similar link
finds documents that are similar to the corresponding
result. It does this by reading the original document
to ascertain its main subject matter, and then conducting
a relevance-ranked search for those subjects.
Result documents are ordered from best to worst match.
The bar graph display indicates the overall quality
of each match.
Note: The document you click on may not
be ranked as the best match. This is because other
documents may contain more information about the
overall subject matter than the original.
Showing linked sites
It is often difficult to navigate from search
results because the matching document has no back-link.
The Linked Sites link solves this. It shows
other documents that contain hyperlinks to the one
you click on. In other words, it is an automated back
button.
Note: Only some of the CSISS
engines offer this feature.