Wildcard Search

Wildcard searches are useful when the spelling of a term may vary or when you want to explore a topic before narrowing your focus. A wildcard query is formed with an asterisk (*) for multiple character replacement or a question mark (?) for single character replacement.

Single Character Replacement

Single character replacement uses a question mark (?) in place of one unspecified character in a search term. This type of search is most useful when a word has common spelling variants or is often misspelled. For example:

    NF?B

The term NFκB appears in several variations: with the proper Greek letter kappa, or with a lowercase or uppercase “k” used as an approximation of kappa. A single character replacement search returns records that contain any of these three forms. Note that while this approach is likely to return more results about NFκB, it also returns records in which any other single character occupies the wildcard position.
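The matching behavior described above can be illustrated with a short Python sketch that translates the question mark into a regular expression. This is an illustration only, not the search engine's actual implementation: the function name single_char_pattern and the sample term list are hypothetical, and the exclusion of hyphens and whitespace follows the rules listed later in this article.

```python
import re

def single_char_pattern(query: str) -> re.Pattern:
    """Translate a query containing '?' into a regex (illustrative sketch)."""
    parts = []
    for ch in query:
        if ch == "?":
            # Exactly one character that is neither whitespace nor a hyphen
            parts.append(r"[^\s\-]")
        else:
            # Every other character must match literally
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

pattern = single_char_pattern("NF?B")

# Hypothetical index terms, chosen only to show what does and does not match
terms = ["NFκB", "NFkB", "NFKB", "NFAB", "NFB", "NF-B", "NFkappaB"]
matches = [t for t in terms if pattern.fullmatch(t)]
print(matches)  # ['NFκB', 'NFkB', 'NFKB', 'NFAB']
```

In this sketch NFAB matches because any single character can fill the wildcard position, while NFB (nothing in that position) and NFkappaB (more than one character) do not.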

Multiple Character Replacement

Multiple character replacement uses an asterisk (*) in a position where zero, one, or several characters may appear in a search term. To continue the example above, multiple character replacement can account for several unspecified characters. For example:

    NF*B

The above example will return results where the Greek letter kappa is spelled out, e.g., NFkappaB, as well as NFκB, NFkB, and NFKB.
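As a rough illustration of why the asterisk matches all of these forms, the following Python sketch treats it as "zero or more characters" within a single term. It is a simplified model under that assumption, not the search engine's implementation; the function name star_pattern and the sample terms are hypothetical.

```python
import re

def star_pattern(query: str) -> re.Pattern:
    """Translate a query containing '*' into a regex (illustrative sketch)."""
    # Escape the literal pieces around each '*', then join them with '.*',
    # which stands for zero or more characters.
    literal_parts = [re.escape(part) for part in query.split("*")]
    return re.compile(".*".join(literal_parts))

pattern = star_pattern("NF*B")

# Hypothetical index terms, chosen only to show what does and does not match
terms = ["NFkappaB", "NFκB", "NFkB", "NFKB", "NFB", "NFAT"]
matches = [t for t in terms if pattern.fullmatch(t)]
print(matches)  # ['NFkappaB', 'NFκB', 'NFkB', 'NFKB', 'NFB']
```

Unlike the single character wildcard, the asterisk in this sketch also matches NFB, where no character at all sits between NF and B.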

Rules for Using Wildcards

  • A multiple character wildcard (*) substitutes for any number of characters in its position, including no characters.
  • A single character wildcard (?) substitutes for exactly one character. It cannot match an empty position, a hyphen, or whitespace.
  • Wildcards can be used at the beginning, middle, or end of a single term, and a query can contain several wildcarded terms, but wildcards do not work inside a quoted phrase. Terms containing dashes also do not work with wildcards: the search engine converts dashes to spaces and groups the resulting text within quotes to keep the terms together, which places the wildcard inside a quoted phrase.
  • When a wildcard is used at the beginning of a term, it is normal for results to return a bit more slowly, because the search engine has to check every term in the corpus for a match. The same principle applies to short prefixes: a search for “g*” will take longer than a search for “ge*” because the set of terms beginning with “ge” is much smaller than the set of terms beginning with “g.”
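Two of the rules above, dash handling and the cost of short prefixes, can be sketched in Python. This is a simplified model based only on the behavior described in this article, not the search engine's actual code; the function names normalize_dashes and count_prefix_candidates and the small vocabulary are hypothetical.

```python
def normalize_dashes(term: str) -> str:
    """Model the described dash handling: dashes become spaces and the
    result is wrapped in quotes so the pieces stay together."""
    if "-" not in term:
        return term
    return '"' + term.replace("-", " ") + '"'

def count_prefix_candidates(vocabulary: list[str], prefix: str) -> int:
    """Count how many terms a trailing-wildcard query would have to consider."""
    return sum(1 for term in vocabulary if term.startswith(prefix))

# A dashed term ends up as a quoted phrase, where wildcards do not work
print(normalize_dashes("NF-?B"))  # "NF ?B"
print(normalize_dashes("NF?B"))   # NF?B (unchanged, wildcard still usable)

# Hypothetical vocabulary: a shorter prefix leaves more terms to check
vocabulary = ["gel", "gene", "genome", "glia", "glucose", "growth", "gut"]
g_count = count_prefix_candidates(vocabulary, "g")    # 7
ge_count = count_prefix_candidates(vocabulary, "ge")  # 3
print(g_count, ge_count)
```

In a real corpus the gap between the two counts would be far larger, which is why the shorter prefix tends to be the slower query.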

Lastly, wildcards are a powerful query syntax tool, but the additional results they return may come at the expense of precision. By nature, a wildcard search returns every term in the corpus that fits your pattern, whether the wildcard falls at the beginning, middle, or end of the term, regardless of how that term relates to your search topic. As with all searches, regardless of the technique used, always review the results for accuracy before continuing your analysis.