Full-Text Search in the Publication Database
The full-text search function of the Publication Database searches for text in those publication records that also meet the criteria defined in the other fields of the search form. The text search may cover the entire publication records or only parts of them. Alternatively, publications can be located by searching the database of the names of their authors, editors, and so on. In contrast to a search of the publication records, this also permits search queries with the full first name of a person. Either of these two search modes may be selected explicitly.
The search string is analyzed first if the default option "Determine search target automatically" or the option "The search text contains the names of persons (search in name records)" has been chosen, or if the search function is invoked directly from the start page of the Publication Database. Depending on the outcome of this analysis, either one of the two modes above or a mixed mode is used. In the mixed mode, if search items could be identified as the names of persons, the search is limited to the publications of these persons, and any remaining search items are treated as expected contents of a publication entry.
The name-based search also takes into account so-called aliases, i.e., different names of the same person (e.g., after marriage). Names may be specified as last names only, or as any combination of full or abbreviated first names and last names, with first and last names in arbitrary order. In the list of search items, however, the first name must immediately precede or follow the last name. (If a first name is also common as a last name, specify the name in the form "last name first name" or use the abbreviated form of the first name to avoid unexpected results.) For performance reasons, a maximum of five names can be actively searched for simultaneously; any names in the search string beyond this maximum are ignored.
In any case, the result is a list of publications whose data records meet the search criteria and/or in which all persons identified by a search in the name records were involved. The search in publication records, including searches that specify the names of authors or other involved persons, can use either of the two search algorithms described below; the search in name records uses only the strict full-text search algorithm.
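The name forms accepted by the name-based search can be illustrated with a short Python sketch. It is not the implementation used by the Publication Database: the function name_matches and its parameters are hypothetical, an abbreviated first name is assumed to be a prefix of the full first name (with an optional trailing period), and aliases and the five-name limit are not modelled.

```python
def name_matches(items, first, last):
    """Check whether one or two adjacent search items denote the person (first, last).

    Hypothetical helper for illustration only.
    """
    # Compare case-insensitively and drop a trailing period ("J." -> "j").
    items = [item.lower().rstrip(".") for item in items]
    first, last = first.lower(), last.lower()

    def is_first_name(item):
        # Assumption: an abbreviation is any non-empty prefix of the first name.
        return bool(item) and first.startswith(item)

    if len(items) == 1:
        # A single item must be the last name.
        return items[0] == last
    if len(items) == 2:
        # First and last name may appear in either order.
        a, b = items
        return (a == last and is_first_name(b)) or (b == last and is_first_name(a))
    return False


print(name_matches(["Smith"], "John", "Smith"))
print(name_matches(["J.", "Smith"], "John", "Smith"))
print(name_matches(["Smith", "John"], "John", "Smith"))
```

In this sketch a first name on its own never matches, which mirrors the statement that names may be given as last names only or as a combination of first and last name.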
The publication database offers two complementary full-text search algorithms:
The Google-style search function is faster; it returns a list of publication entries ordered by their relevance with respect to the search string. Particularly with short search strings, its results may occasionally be unexpected. At least in its default operation mode, it selects those publication records that contain at least one of the words in the search string. A greater number of search items therefore increases, in general, the number of records found. Please refer to the following detailed description for the proper application and optimisation of the Google-style search function.
The strict full-text search function is activated by ticking the checkbox "Strict search" in the field "Search for". It is significantly slower than the Google-style search function, particularly if the selection of publications is not, or not significantly, restricted. It returns a list of publication records grouped by publication type, containing all records that hold all specified search items. A greater number of search items therefore reduces, in general, the number of records found. You can find more information on the strict search function in the description of this algorithm later on this page.
The Google-style search function finds only those words in the publication records that exactly match one of the words in the search string. (With the optional operator "*", it also finds words that begin with one of the words in the search string.) In contrast, the strict full-text search function also finds records containing words that include one of the words in the search string. With the search string "electron", the Google-style search function therefore finds only records that hold the word "electron" (in upper- or lowercase); the strict text search function would also find "microelectronics". (With "electron*", the Google-style search function would find "electronics" but not "microelectronics". A preceding asterisk ("*"), such as in "*electron" or "*electron*", is ignored by the Google-style search function.)
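The difference between the two matching rules can be made concrete with a small Python sketch. It only imitates the behaviour described above and is not the code of the database back-end: the function names are hypothetical, words are assumed to consist of letters and digits, the strict search is assumed to be case-insensitive, and relevance ranking is left out.

```python
import re


def words(text):
    """Split text into lowercase words (assumed: runs of letters and digits)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def google_style_match(query, record):
    """True if the record contains at least one query word as a whole word.

    A trailing "*" makes the word a prefix; a leading "*" is ignored.
    """
    record_words = words(record)
    for term in query.lower().split():
        is_prefix = term.endswith("*")
        term = term.strip("*")
        if not term:
            continue
        if is_prefix:
            if any(word.startswith(term) for word in record_words):
                return True
        elif term in record_words:
            return True
    return False


def strict_match(query, record):
    """True if the record contains every query item as a substring."""
    text = record.lower()
    return all(item in text for item in query.lower().split())


record = "Advances in microelectronics"
print(google_style_match("electron", record))   # whole-word match only
print(strict_match("electron", record))         # substring match
print(google_style_match("electron*", "Modern electronics"))
```

With two search words, google_style_match needs only one of them to occur, whereas strict_match needs both, which is why adding search items widens the first result list and narrows the second.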
The Google-style full-text search function
The Google-style full-text search uses a function of the database back-end. The program code of the publication database therefore has hardly any influence on its operation. Since it is much faster than the strict text search function, it has been implemented as the standard full-text search function. It returns a list of publication records ordered by relevance. For each record, its relevance and its publication type are given. The relevance of a publication record increases with the number of occurrences of any of the words in the search string within the searched part of the record. In addition, the publication type and quality may influence its relevance. The following rules apply for the Google-style full-text search:
A number of operators can modify the search behaviour. The relevance of the records found changes if at least one of the following operators is used. In that case, search words that occur in more than half of the records are no longer ignored. The following operators are available:
The results of a search may differ from those described here if combinations of operators are used. The Google-style full-text search is a feature of the database back-end and hence beyond the control of the developer of the publication database. The following examples, taken from the documentation of the database back-end, may serve to illustrate the usage of these operators:
The strict full-text search function
This search mode is activated by ticking the checkbox "Strict search" in the field "Search for". If the search string contains several words, the Publication Database by default returns all otherwise matching records that contain all words of the search string, regardless of the order in which, and the searched fields in which, these words occur. The following characters are separators between words:
The search text "This is an example" results in a search for records that contain the four words "This", "is", "an", and "example" in arbitrary order and in arbitrary locations within the text fields that have been selected in the list "Text search in:". Search items that consist of one of the above separators, or that contain a separator but should not be split there, can be enclosed in double quotes ("). Pairs of quotes are removed before the actual search; they prevent, however, the splitting of the text between them. The search text may contain an arbitrary number of search items in double quotes. Hence, the search text ""This is" "an example"" results in the two search items ""This is"" and ""an example"". To search for one of the above separators, place it between double quotes (e.g., ""+"" to search for a plus sign). It is possible to use search items that contain a single double quote; they must be specified after all search items enclosed in pairs of double quotes. You can also put the entire search text in double quotes, e.g., ""This is an example"". Use this feature if you are quite sure that there are records containing your search phrase but the default settings return too many search hits. Please note:
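The splitting of a strict-search text into search items, as described in this section, can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of the Publication Database: the function split_search_text is hypothetical, and whitespace is assumed to be the only separator, so other separator characters such as "+" are not modelled.

```python
import re


def split_search_text(text):
    """Split a strict-search text into search items.

    Assumption: whitespace is the only separator between unquoted words.
    """
    items = []
    # Each match is either the content of a pair of double quotes
    # or a run of non-whitespace characters.
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', text):
        item = quoted if quoted else bare
        if item:
            items.append(item)
    return items


print(split_search_text('This is an example'))
print(split_search_text('"This is" "an example"'))
print(split_search_text('"+"'))
```

Because pairs of quotes are consumed from left to right, an item containing a single double quote is only kept intact in this sketch if it follows all quoted items, which matches the ordering rule stated above. A record would then have to contain every returned item to be listed.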