About PDB At A Glance

The PDB structure entries, consisting of a collection of files having nondescript names, cannot be easily grasped in a biochemically meaningful context. Manually organizing the structures based on the descriptive information in the files is becoming less and less practical as the database expands.

A chemically or biologically meaningful context can be provided by the user in the form of a search keyword (e.g. hemoglobin), but the range of available contexts cannot be predetermined from the database itself; users must know, in general, what they are looking for. Although searching is an extremely useful approach for locating specific PDB entries, the scope of the database is best ascertained by browsing a set of predetermined contexts. Useful contexts include molecular classes (e.g. "cytochrome"), secondary/tertiary structural classes (e.g. "globin fold"), functional classes (e.g. "binding protein"), species of origin, and experimental determination method.

The descriptive information in the PDB files is distributed among a set of fields (e.g. "HEADER"). The PDB structures can be classified in a given context by searching the information in these fields using pre-defined search strings. Previous versions of PDB At A Glance were based on manually compiled static lists that required updating with each new version of the PDB. In the past, automated searching could not discriminate between applicable and inapplicable descriptive information in the PDB files, which resulted in an unacceptably large number of misclassifications. A field-specific search engine has recently been developed, paving the way for an automated PDB classification system.
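As a rough illustration of why restricting a search to specific fields reduces misclassification, the following Python sketch matches search strings only against chosen record types. It is not the search engine used by PDB At A Glance: the function names, the sample entry, and the choice to read field text from column 11 onward are assumptions made for this example.

```python
def field_text(pdb_lines, field):
    """Join the text of every record whose name (columns 1-6) equals `field`."""
    parts = []
    for line in pdb_lines:
        if line[:6].strip() == field:
            # Assumption: descriptive text starts at column 11 (index 10).
            parts.append(line[10:].strip())
    return " ".join(parts)


def matches_context(pdb_lines, search_strings, fields=("HEADER", "COMPND")):
    """Return True if any search string occurs in one of the given fields."""
    for field in fields:
        text = field_text(pdb_lines, field).upper()
        if any(s.upper() in text for s in search_strings):
            return True
    return False


# Hypothetical entry, written for this example only.
entry = [
    "HEADER    OXYGEN TRANSPORT",
    "COMPND    HEMOGLOBIN",
    "REMARK   3 SAMPLE WAS TREATED WITH A PROTEASE BEFORE CRYSTALLIZATION",
]

# "protease" appears only in a REMARK, so a field-specific search
# does not misclassify this hemoglobin entry as a protease.
print(matches_context(entry, ["protease"]))
print(matches_context(entry, ["hemoglobin"]))
```

A search over every line of the file would have matched "protease" here; limiting the search to "HEADER" and "COMPND" keeps the entry out of the protease list.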

PDB At A Glance consists of a set of pre-defined biochemically meaningful search contexts (accessed by keyword). Every attempt has been made to select a set of keywords (e.g. "protease") that represent the entire territory of the database. However, some structures may have been unintentionally omitted (please let us know if you find any). If a structure cannot be found in any of the keyword lists, a user-defined search should be performed using Molecules R Us.

Each of the structural classifications in PDB At A Glance has been separated, where applicable, into the following sub-types: non-complexed (generic, inhibitor, or activator) and complexed (generic or inhibitor). This classification is based on the presence or absence of the words "complex", "inhibitor", or "activator" in the "COMPND" or "HEADER" fields of the PDB files. In most cases, macromolecules bound solely to metal ions, small inorganic ligands (e.g. molecular oxygen), H+, etc. are not designated as complexes by PDB contributors, and are therefore not picked up as complexes by the search.
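The sub-type assignment can be sketched in Python as follows. The text states only which words are tested and in which fields; the order in which the words are combined below, and the treatment of an entry containing both "complex" and "activator" as a generic complex, are assumptions made for this example. The function name is hypothetical.

```python
def classify_subtype(header_text, compnd_text):
    """Assign a sub-type from keywords in the HEADER and COMPND text.

    Assumed precedence: "complex" is checked first, then "inhibitor",
    then "activator". The source does not specify this ordering.
    """
    text = f"{header_text} {compnd_text}".upper()
    has_complex = "COMPLEX" in text
    has_inhibitor = "INHIBITOR" in text
    has_activator = "ACTIVATOR" in text

    if has_complex:
        return "Inhibitor-Complexed" if has_inhibitor else "Generic Complexed"
    if has_inhibitor:
        return "Non-Complexed Inhibitor"
    if has_activator:
        return "Non-Complexed Activator"
    return "Generic Non-complexed"


# Hypothetical field contents, written for this example only.
print(classify_subtype("HYDROLASE", "TRYPSIN COMPLEX WITH INHIBITOR"))
print(classify_subtype("OXYGEN TRANSPORT", "HEMOGLOBIN"))
```

Because the test is a plain substring match, an entry for oxygen-bound hemoglobin whose fields never use the word "complex" falls into the generic non-complexed group, consistent with the caveat above.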

Legend
Generic Non-complexed
Generic Complexed
Inhibitor-Complexed
Non-Complexed Inhibitor
Non-Complexed Activator


Related PDB Browsing Tools

"The Annotated Guide to the Brookhaven Protein Data Bank" (Laura Lynn Walsh, University of Illinois) is a flat listing of PDB entries according to molecular class and species of origin, with sub-classifications for structure and experimental determination method.

PDB entries are listed in the context of secondary/tertiary structure in the "Structural Classification of Proteins (SCOP)" (Cambridge Univ.) WWW server. This resource makes use of hypertext to provide a hierarchical organization and direct links to an interactive viewer.