Public Hub Guidelines: Difference between revisions

From genomewiki

Revision as of 05:28, 18 August 2013

Suggestions for Public Hubs

Based on common problems seen in hubs, this page outlines recommendations from the UCSC Genome Browser engineers.

Guidelines for User Experience:

  • Have no more than 10 tracks set to display by default when a user first opens your hub.
  • Have a description page for every configuration page (composite or standalone track).
  • The description page should preferably contain UCSC's standard Description, Methods, Contacts... sections, as described under "html" in the Hub Track Database Definition.
  • The description page MUST have a contact email address prominently displayed.
  • Note that multiple composites/tracks can use the same description page by using the html setting.
  • Related tracks should be combined into composites where appropriate. The hub track group should not be overwhelmed with individual tracks when they can be combined into a meaningful composite organization. Such use of composites makes user configuration easier.
  • Extremely large hubs may use superTracks as well to achieve a meaningful hierarchy.
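
As a rough illustration of the default-visibility and shared description page points above, a trackDb.txt fragment might look like the following. This is a minimal sketch: the track names, file names, and labels are hypothetical, and the description file name given to html is an assumption about how your hub is laid out.

```
# Hypothetical standalone tracks; names and file paths are placeholders.
# Only the first track is displayed when the hub is first opened.
track exampleSignal
type bigWig
bigDataUrl exampleSignal.bw
shortLabel Example signal
longLabel Example signal track (displayed by default)
visibility full
html exampleDescription.html

# Hidden by default, and reusing the same description page via html.
track examplePeaks
type bigBed 3
bigDataUrl examplePeaks.bb
shortLabel Example peaks
longLabel Example peak calls (hidden by default)
visibility hide
html exampleDescription.html
```

Keeping most tracks at visibility hide is one way to stay under the suggested limit of 10 tracks displayed by default.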

Guidelines for Composites:

  • Have multi-view only when there is more than one view. Views ideally give alternate access to the same data (e.g. signals and called peaks). Keep in mind that the value of views is that they allow for more than one data/configuration type (e.g. bigBed and bigWig) in a single composite. All subtracks of a view must have the same data type. Likewise, all subtracks of a non-multi-view composite must be the same type.
  • Never represent the same subgroup in both view and as a dimension (e.g. NOT dimensions dimX=view). For that matter a subgroup should never be in two dimensions (e.g. NOT dimensions dimX=cell dimY=mark dimA=cell). The composite will appear to function but multiple ways of selecting the same thing will create a confusing and inconsistent User Interface.
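
The multi-view arrangement described above could be sketched as follows, assuming a hypothetical composite with a signal view (bigWig) and a peaks view (bigBed). All track names, tags, labels, and file names are placeholders, not taken from a real hub. Note that view appears only as a view and is not repeated in a dimensions line.

```
# Hypothetical multi-view composite: two views with different data types.
track exampleComposite
compositeTrack on
shortLabel Example composite
longLabel Example composite with signal and peak views
subGroup1 view Views SIG=Signal PK=Peaks
type bed 3
html exampleComposite.html

    # Every subtrack of this view is bigWig.
    track exampleViewSignal
    parent exampleComposite
    shortLabel Signal
    view SIG
    visibility full
    type bigWig

        track exampleSignalK562
        parent exampleViewSignal on
        subGroups view=SIG
        shortLabel K562 signal
        longLabel Example signal in K562
        type bigWig
        bigDataUrl exampleSignalK562.bw

    # Every subtrack of this view is bigBed.
    track exampleViewPeaks
    parent exampleComposite
    shortLabel Peaks
    view PK
    visibility dense
    type bigBed 3

        track examplePeaksK562
        parent exampleViewPeaks on
        subGroups view=PK
        shortLabel K562 peaks
        longLabel Example peak calls in K562
        type bigBed 3
        bigDataUrl examplePeaksK562.bb
```

Further subtracks would follow the same pattern under the view that matches their data type.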

Guidelines for Using Dimensions:

  • There should be no dimensions with a single entry (do not have only one cell line represented in dimX=cell), unless data growth is expected to fill in additional entries.
  • Using only one dimension: preferably use dimX (e.g. dimensions dimX=cell). This saves vertical User Interface space, but is not always the best choice.
  • Using two dimensions: use dimX and dimY (e.g. dimensions dimX=cell dimY=mark)
  • Using more than two: use dimX, dimY on the most important dimensions. Then use dimA,B,C as needed on lesser dimensions. (e.g. dimensions dimX=cell dimY=mark dimA=donor_id)
  • The A,B,C.. dimensions should probably use filterComposite (e.g. filterComposite dimA)
  • Each dimension and view should be represented in sortOrder, ideally in order of dimX, dimY, dimA,B,C, view (e.g. sortOrder cell=+ mark=+ donor_id=+ view=+). But the hub user may wish for a different sortOrder, which is fine.
  • Tags of a subGroup/dimension should be short, with no special characters. Labels, by contrast, can have HTML codes embedded (e.g. NOT CPG_methylation_%=CPG_methylation_% RATHER mpct=CPG_methylation_&#37;)
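
Putting the dimension guidelines together, a composite stanza might be sketched as below. The subgroup tags, labels, donor identifiers, and file name are hypothetical. There is no view in this example, so sortOrder lists only the three dimensions.

```
# Hypothetical composite with three dimensions; all names are placeholders.
track exampleDimensions
compositeTrack on
shortLabel Example dimensions
longLabel Example composite organized by cell, mark and donor
type bigWig
# Short tags without special characters; the label carries the HTML code for "%".
subGroup1 cell Cell_Type K562=K562 HepG2=HepG2
subGroup2 mark Mark H3K4me3=H3K4me3 mpct=CPG_methylation_&#37;
subGroup3 donor_id Donor d1=Donor_1 d2=Donor_2
# X and Y carry the most important dimensions; donor is a lesser one.
dimensions dimX=cell dimY=mark dimA=donor_id
filterComposite dimA
sortOrder cell=+ mark=+ donor_id=+

    track exampleK562H3K4me3D1
    parent exampleDimensions on
    subGroups cell=K562 mark=H3K4me3 donor_id=d1
    shortLabel K562 H3K4me3 d1
    longLabel Example H3K4me3 signal in K562, donor 1
    type bigWig
    bigDataUrl exampleK562H3K4me3D1.bw
```

Each subgroup here has at least two entries and appears in exactly one dimension, in line with the guidelines above.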

Miscellaneous Guidelines:

  • The use of metadata lines is supported, but users should be aware that this support may be replaced by another system in the future.

Public Hub Examples

The browser's public hubs provide excellent resources to see how others have created hub structures. Use the Hub Track Database Definition glossary as a reference for interpreting lines. For an excellent example of hub configuration and documentation, please see the ENCODE Analysis hub:

http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/genomes.txt
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformTfbs.html
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformRNA.html

When creating HTML documentation for a hub based on a paper, we suggest the paper's abstract as a useful start for your track's Description section. The Methods section should have more detail, and please include a contact for questions. Lastly, it is best to assume a broad audience of students as well as researchers. For example, spell out common acronyms for those who may be new to genomics.