How to Search the Canary Database
The Canary Database search interface uses several techniques
to make searching seem to "just work." Our goal is that
anyone familiar with basic PubMed or Google searching will feel
right at home, and also that more experienced searchers will be
able to reuse familiar syntax and techniques.
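To learn more about how to search, see the sections below:
Basic Searching, Query Syntax, Searchable Fields, Finding Related
Records, Still Under Development, and How It Works.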
Basic Searching
To search the Canary Database, enter any search terms in the
search box at the top of the screen. All search terms you
enter will be required for a record to match. By default,
search terms will be matched against all available data. See
"Query Syntax" and "Searchable
Fields" to learn how to construct complex searches, and
to search specific fields.
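As a rough illustration of the "all terms required" rule described above, the following minimal sketch treats a record as a bag of lowercase words and requires every query term to be present. The function name and the sample record are hypothetical and are not part of the Canary Database itself, which delegates matching to its search library.

```python
def matches_all_terms(query: str, record_text: str) -> bool:
    """Return True only if every whitespace-separated query term occurs in the record."""
    words = set(record_text.lower().split())
    return all(term in words for term in query.lower().split())


# Hypothetical record text, for illustration only.
record = "West Nile virus in crows from New York"

print(matches_all_terms("crows virus", record))    # both terms present
print(matches_all_terms("crows anthrax", record))  # "anthrax" is missing
```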
The Canary Database uses several vocabularies to match different
names of equivalent concepts. For example, searches for "dog",
"dogs", or "canis familiaris" should all match any record that
has been curated with the species "Dogs". Similarly, a search
for either "SARS" or "severe acute respiratory syndrome" should
match studies curated with the exposure "SARS Virus" or
the outcome "Severe Acute Respiratory Syndrome." See
"How It Works" for more information about
the vocabularies we use.
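The vocabulary matching described above can be pictured as a lookup from variant names to curated concepts. The sketch below is only an assumption about the general idea: the CONCEPTS table is a tiny hand-written stand-in built from the examples in this section, and the function names are hypothetical.

```python
# Hypothetical mini-vocabulary built from the examples in this section.
# Each variant name maps to the curated concepts it should match.
CONCEPTS = {
    "dog": {"Dogs"},
    "dogs": {"Dogs"},
    "canis familiaris": {"Dogs"},
    "sars": {"SARS Virus", "Severe Acute Respiratory Syndrome"},
    "severe acute respiratory syndrome": {
        "SARS Virus",
        "Severe Acute Respiratory Syndrome",
    },
}


def concepts_for(term: str) -> set:
    """Look up the curated concepts for a term, ignoring case and extra spaces."""
    key = " ".join(term.lower().split())
    return CONCEPTS.get(key, set())


def record_matches(term: str, curated_values) -> bool:
    """Return True if any concept for the term appears among a record's curated values."""
    return bool(concepts_for(term) & set(curated_values))


print(record_matches("Canis familiaris", ["Dogs"]))
print(record_matches("SARS", ["SARS Virus"]))
```

Mapping each term to a set of concepts, rather than to a single one, reflects the SARS example, where one search term matches both an exposure and an outcome.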
The search process will rank all matching records according to
their relevance to your search terms, and the matches will be
listed in order, with the "most relevant" matches at the top.
Query Syntax
Several standard techniques for searching are available in the
Canary Database. Boolean, wildcard, fielded, phrase, and proximity
searching are all possible. Many of these techniques can also be
combined together. Examples of each follow:
| Type | Examples | Notes |
| Booleans |
(mosquitofish or trout) and australia
canada not toronto
+tularemia -anthrax
|
Use parentheses to specify logical grouping/precedence.
"and" and "or" combine terms in typical fashion. "not"
requires the following term to not match.
Adding "+" to the left of a search term requires that value to
match; adding "-" to the left of a search term requires that
value to not match.
|
| Wildcards |
terror*
sm?th?
|
"*" added to the right of a search term will match zero-to-many
characters at the end of the term (right truncation). In this
example, it will match any of "terror", "terrorism", or "terrorist".
"?" will match zero or any one character. In this example,
either "Smith" or "Smythe" will match.
Note that wildcards do not work at the left of a search term.
|
| Fields |
smith [au]
smith.au.
"beluga whale" [spec]
species:"beluga whale"
|
Several familiar ways to specify a search field are available.
value [field] is "PubMed-style" and works for all fields.
value.field. is "BRS-style" and also works for all fields.
field:value, as in the last example, is the prefix style used by
Lucene's own query syntax.
See the list of Searchable Fields
below for complete details.
|
| Phrase |
"endocrine disruptors"
"sars virus" [exp]
"sleeping giant state park".loc.
|
Use double quotes to search for a specific value with
multiple terms separated by spaces.
Note that this works with fielded search. Quotes are
optional for single-term values, and can be left out.
|
| Proximity |
Lappivara~ [au]
"ebola chimpanzees"~5
|
Using "~" after a single-word search term will match
spelling variations (i.e. "edit distance") in that
single word.
Using "~5" after a multi-word search phrase will match
multiple terms found within five words of each other
(i.e. "proximity").
|
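To make the wildcard, edit-distance, and proximity notes in the table more concrete, here is a minimal sketch of each idea in plain Python. It follows the behavior described in the Notes column ("*" for zero-to-many characters, "?" for zero or one character) and is not the search library's actual implementation. All function names are hypothetical, and the proximity check is a simplification that measures the gap between single words.

```python
import re


def wildcard_match(pattern: str, word: str) -> bool:
    """Match a word against a pattern where "*" is zero-to-many characters
    and "?" is zero or one character, ignoring case."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".?")
        else:
            parts.append(re.escape(ch))
    regex = re.compile("".join(parts), re.IGNORECASE)
    return regex.fullmatch(word) is not None


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def within_proximity(text: str, term_a: str, term_b: str, max_gap: int) -> bool:
    """Return True if term_a and term_b occur within max_gap words of each other."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)


print(wildcard_match("terror*", "terrorism"))
print(wildcard_match("sm?th?", "Smythe"))
print(edit_distance("smith", "smythe"))
# Hypothetical sentence, for illustration only.
print(within_proximity("ebola outbreak among wild chimpanzees", "ebola", "chimpanzees", 5))
```

A small edit distance is what lets a "~" search tolerate spelling variations in a single word, while the proximity check corresponds to the "~5" form applied to a quoted phrase.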
Searchable Fields
Many fields are indexed and available for searching. To find
specific values for any particular field, specify the field name
in a query like this (for an author search):
daszak.au.
daszak [au]
daszak [author]
author:daszak
"zelikoff jt".au.
"zelikoff jt" [AU]
author:"zelikoff jt"
All of the above will search for the specified value in the
author field. Note that to search for an author name using
both last name and initials, best results will be obtained by
enclosing the last name and initials in double quotes.
All available search fields are listed below. For any particular
field, any of the abbreviated or complete field names may be
searched, and will yield equivalent results.
| Abbreviation(s) | Field Name |
| 1au | First author (matches *only* first author) |
| ab, abstract | Abstract |
| af, affiliation | Affiliation |
| all | All fields (Note: This is the default) |
| au, author | Author |
| exp, exposure | Exposures |
| gn, grantnum | Grant number |
| issn | ISSN |
| is, issue | Issue |
| jn, journal | Journal |
| kw, word, keyword | Keyword |
| loc, location | Location |
| me, meth, methodology | Methodology (Study type) |
| out, outcome | Outcomes |
| pg, page, pages | Pages |
| pd, date, year | Publication date |
| rn, registrynum | Registry number |
| rf, risk_factor | Risk factors |
| spec, species | Species |
| mh, sh, subject | Subject |
| ti, title | Title |
| ui, uid | Unique identifier |
| vol, volume | Volume |
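Because several field syntaxes and abbreviations yield equivalent results, one way to think about fielded searching is as a normalization step that reduces every form to a single (field, value) pair. The sketch below is an assumption about how such a step could look, not the database's actual parser. FIELD_ALIASES covers only a few rows of the table above, and parse_fielded is a hypothetical helper name.

```python
import re

# Partial alias map taken from the table above (only a few fields shown).
FIELD_ALIASES = {
    "au": "author", "author": "author",
    "spec": "species", "species": "species",
    "exp": "exposure", "exposure": "exposure",
    "loc": "location", "location": "location",
    "ti": "title", "title": "title",
}

# value [field], value.field., and field:value respectively.
_PUBMED_STYLE = re.compile(r"^(?P<value>.+?)\s*\[(?P<field>\w+)\]$")
_BRS_STYLE = re.compile(r"^(?P<value>.+?)\.(?P<field>\w+)\.$")
_PREFIX_STYLE = re.compile(r"^(?P<field>\w+):(?P<value>.+)$")


def parse_fielded(query: str) -> tuple[str, str]:
    """Return (field, value) for a single fielded query.

    Queries with no recognized field fall back to the default field "all".
    """
    query = query.strip()
    for pattern in (_PUBMED_STYLE, _BRS_STYLE, _PREFIX_STYLE):
        match = pattern.match(query)
        if match:
            field = FIELD_ALIASES.get(match.group("field").lower())
            if field:
                # Drop surrounding spaces and the optional double quotes.
                return field, match.group("value").strip().strip('"')
    return "all", query.strip('"')


for q in ['daszak.au.', 'daszak [author]', 'author:daszak', '"zelikoff jt" [AU]']:
    print(q, "->", parse_fielded(q))
```

Lowercasing the field name before the lookup is what makes "[AU]" and "[au]" behave the same way in this sketch.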
Finding Related Records
When you find a record that interests you, click on the "Related"
tab to find links which search the database for similar records.
Links will be available to search for more records based on
study information such as author or journal names, and on curated
data such as species and exposures.
Currently this is limited to searching on a single similar value
(e.g. "more from this author" or "more about this species"). We
are working to add other ways, including an advanced search screen
where multiple "similar values to search" may be specified, and
other algorithmic means to find "studies like this one."
Still Under Development
We are continually working to improve the database model and indexing
strategies for the Canary Database. Because the studies we curate
include information from a variety of abstracting and indexing sources,
we are exploring additional ways to make advanced searching capabilities
easy to use across these source records. We expect to offer a flexible
advanced search feature soon.
How It Works
The Canary Database uses PyLucene, a Python version of the Lucene
information retrieval library, in its search interface. Lucene is very
flexible and very fast, and it allows us to index and search a wide
variety of fields in curated studies.
We use the UMLS Metathesaurus to match subject headings and species
names from MeSH and the NCBI Taxonomy, and we also cross-reference
species names from ITIS. We use the USGS GNIS and NIMA GNS gazetteers
to curate study locations with over seven million geographic feature
names.
We're very grateful to the developers of Lucene and PyLucene for making
such an excellent and sophisticated suite of tools available as Free
Software, and also to the publishers of the vocabularies and related
tools mentioned above, for their excellent, free-of-charge products.