Public Hub Guidelines: Difference between revisions

(→‎Public Hub Examples: Adding link to filters and to useOneFile on Docs #27255)
(34 intermediate revisions by 6 users not shown)
= Suggestions for Public Hubs =
This page is intended to lay out guidelines for those who are trying to create [https://genome.ucsc.edu/cgi-bin/hgHubConnect Public Hubs]. If you’ve created a hub that you feel meets these requirements and is of general interest to the research community, please contact us at [mailto:genome-www@soe.ucsc.edu genome-www@soe.ucsc.edu] to have it added to the list.


Based on common problems seen in hubs, this page outlines recommendations from the UCSC Genome Browser engineers. Please note that hosting hub files on HTTP tends to work even better than FTP because of the difference in the number of open tcp connections needed. (As a reference for interpreting trackDb.txt lines use the ''Hub Track Database Definition'' [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#toc glossary])
(As a reference for interpreting trackDb.txt settings use the ''Hub Track Database Definition'' [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#toc glossary])


==Guidelines for User Experience:==
= Required Guidelines =
* Have no more than 10 tracks with [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#visibility visibility] set to display on as default upon first opening your hub.
The following guidelines must be met before your hub will be added to our public list:
* Add a descriptionUrl html page to hub.txt that includes search terms that will be indexed to help users find your hub.
* Required for both track and assembly hubs:
* Have a description page for every configuration page (composite or stand alone track).
** You MUST have a description page for every configuration page (composite, superTrack, or standalone track). Note that multiple tracks and/or composites can use the same description page with the “html” setting. You can find more information on creating track description pages in the [[#Track description page recommendations | Track description page recommendations]] section below.
* The description page should preferably contain UCSC's standard ''Description'', ''Methods'', ''Contacts''... sections as defined [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#commonSettings here under "html"].
** All of your description pages MUST have a contact email address prominently displayed.
* The description page MUST have a contact email address prominently displayed.
** Try to have no more than 10 tracks with [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#visibility visibility] set to display (in full, pack, dense, or squish) as default upon first connecting your hub.
* Note that multiple composites/tracks can use the same description page by using the html setting.
** Specify a descriptionUrl html page in your hub.txt. This should be a URL to a description page for your entire hub; public hubs often link to a full-text paper or to a laboratory webpage that describes the research presented in the hub. On the Public Hubs page, the longLabel from hub.txt is displayed as a hyperlink to this page, while the shortLabel is a hyperlink to the hub.txt location.
* Related tracks should be combined into [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#compositeTrack composites] where appropriate. The hub track group should not be overwhelming with individual tracks when they can be combined into a meaningful composite organization. Such use of composites will make user configuration easier.
* Required for only assembly hubs:
* Extremely large hubs may use [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#superTrack superTracks] as well to achieve a meaningful hierarchy.
** Add a gateway page for each assembly by including an ''htmlPath'' line in genomes.txt for each genome not already hosted by UCSC. See the [http://genomewiki.ucsc.edu/index.php/Assembly_Hubs Assembly Hubs Wiki].
** The following settings should be properly set in your genomes.txt (the last three settings make it easier to find assembly hub species through the hgGateway UI search):
*** defaultPos
*** scientificName
*** organism
*** description
 
= Recommended Guidelines =
 
The guidelines in the following sections are recommended to improve the user experience, but they do not need to be implemented before the hub is added to our list of Public Hubs.
 
== Track organization recommendations ==
Related tracks can be grouped in a few different ways, namely [https://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#superTrack superTracks], [https://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#aggregate multiWigs], and [https://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#compositeTrack composites]. If your hub includes a large number of tracks, the grouping of tracks may be necessary. This will prevent your hub’s track group from being an overwhelming mess of individual tracks and can make user configuration of your tracks easier.
 
=== Composite tracks ===
Related tracks of the same data type (e.g. a set of related bigBed tracks) should be combined into [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#compositeTrack composites] where appropriate.  
 
* Have [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#view multi-view] only when there is more than one view. Views ideally give alternate access to the same data (e.g. signals and called peaks). Keep in mind that the value of views is that they allow for more than one data/configuration type (e.g. bigBed and bigWig) in a single composite. All subtracks of a view must have the same data type. Likewise, all subtracks of a non-multi-view composite must be the same type.
* Recommendations for using dimensions with your composite tracks:
** There should be no [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#dimensions dimensions] with a single entry (do not have only one cell line represented in dimX=cell), unless data growth is expected to fill in additional entries.
** Using only one dimension: preferably use dimX (e.g. dimensions dimX=cell). This saves vertical User Interface space, but is not always the best choice.
** Using two dimensions: use dimX and dimY (e.g. dimensions dimX=cell dimY=mark)
** Using more than two: use dimX, dimY on the most important dimensions. Then use dimA,B,C as needed on lesser dimensions. (e.g. dimensions dimX=cell dimY=mark dimA=donor_id)
** The A,B,C.. dimensions should probably use [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#filterComposite filterComposite] (e.g. filterComposite dimA)
** Each dimension and view should be represented in [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#sortOrder sortOrder], ideally in the order dimX, dimY, dimA,B,C, view (e.g. sortOrder cell_type=+ mark=+ donor_id=+ view=+). The hub user may wish for a different sortOrder, which is fine.
** Tags of subGroup/dimension should be short and simple, with no special characters. Labels, on the other hand, can have HTML codes embedded (e.g. NOT CPG_methylation_%=CPG_methylation_% RATHER mpct=CPG_methylation_&_#37)
** Never represent the same [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#subGroups subgroup] in both view and as a dimension (e.g. NOT dimensions dimX=view). For that matter a subgroup should never be in two dimensions (e.g. NOT dimensions dimX=cell dimY=mark dimA=cell). The composite will appear to function but multiple ways of selecting the same thing will create a confusing and inconsistent User Interface.
 
=== Super tracks ===
 
Extremely large hubs may use [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#superTrack superTracks] as well to achieve a meaningful hierarchy. Super tracks can be used to group together any type of related tracks; for example, you could combine a multiWig, a composite and a bigBed track together into a single superTrack.
 
== Track display recommendations ==
* Avoid setting a composite track and all of its subtracks to the same visibility.  When you have composite tracks that are hidden by default, it is best to still designate some subtracks to display when the composite track is turned on (visibility dense, versus the default of hide).  This provides an example of your track data to users who turn on your composite track. If no subtracks are turned on by default, a user who changes your composite track visibility to "show" won't see anything.
* The [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#shortLabel shortLabel] text should be under 17 characters, or meaningful information may be cut off from display when tracks are set to "dense" visibility.
* The length for a [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#longLabel longLabel] should be limited to around 75 characters.
* It is best to avoid setting a composite track and all of the corresponding subtracks to the same visibility.  When you have composite tracks that are hidden by default, it is best to still designate some subtracks to display when the composite track is turned on (visibility dense, versus the default of hide).  This provides an example of your track data to users who turn on your composite track.  If no subtracks are turned on by default, a user who changes your composite track visibility to "show" won't see anything.
* If you are making an assembly hub, you will want to add a gateway page for each assembly by including an ''htmlPath'' line in genomes.txt for each genome not in the Browser. See the [http://genomewiki.ucsc.edu/index.php/Assembly_Hubs Assembly Hubs Wiki].


== Guidelines for Composites:==
== Track description page recommendations ==
* The description page should preferably contain UCSC's standard ''Description'', ''Methods'', ''Contacts''... sections as defined [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#commonSettings here under "html"] and here is [http://genome.ucsc.edu/goldenpath/help/examples/hubExamples/templatePage.html an example template].
** Here are some examples of well-done track description pages from various public hubs:
*** [https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=susScr3&hubUrl=http://public.hpcagrogenomics.wur.nl/ABGC/Track_Hubs/Chankyu/hub.txt&g=hub_141191_Liver_RRBS Liver DNA Methylation] track in the Porcine DNA methylation hub - provides a nice example of how you can use colors on your description pages
*** [https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&hubUrl=http://apprisws.bioinfo.cnio.es/trackHub/hub.txt&g=hub_67585_3DStructureAPPRIS APPRIS - protein structural information] track in the Principal Splice Isoforms APPRIS hub - provides an example of how you might integrate images into your track description pages and use the same description page for multiple tracks
* Your track description pages should provide meaningful documentation for your tracks
** If you are creating a hub based on a paper, use the paper's abstract as a starting point for your track's ''Description'' section
** The ''Methods'' section should expand upon the overview in the ''Description'' section and provide more details about how the data for the track was produced
** You should assume a broad audience of students and researchers will use your hubs, so spell out common acronyms for those who may be new to genomics. For example, you might write out a term and its acronym as “Fluorescent in situ hybridization (FISH)”, which spells out the term once and then provides the acronym that you can use throughout the rest of your description page.
* It might be a good idea to include a “Data Access” section on your track description page which describes how to access the data in your hub and where to download the raw data for the tracks in your hub. You can see some examples of “Data Access” sections on the [https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=refSeqComposite NCBI RefSeq] and [https://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&g=spUniprot UniProt] track description pages.
 
== Miscellaneous recommendations ==
* Please note that hosting hub files on HTTP tends to work even better than FTP because of the difference in the number of open TCP connections needed.
* ''Metadata lines'' are supported, but users need to be well aware that '''support may be replaced''' by another system in the future.
* Create a [https://genome.ucsc.edu/cgi-bin/hgPublicSessions Public Session] that highlights the different data available in your hub in a biologically interesting area of the genome. Be sure to include a "Description" for your session. More about sessions can be found [https://genome.ucsc.edu/goldenPath/help/hgSessionHelp.html here].
 
= Connection issues? =
 
Sometimes the servers hosting public hubs undergo administrative changes and no longer successfully serve hub files.  In most cases, it is likely that new firewalls at the institution are limiting access and causing these connection problems. Please ask your institution's administrators to add the IP addresses below as exceptions that are not limited.
 
These IP addresses are currently used by official genome browser mirrors:
 
* 128.114.119.* = genome.ucsc.edu
* 129.70.40.99 = European mirror, genome-euro.ucsc.edu
* 134.160.84.67 = Asian mirror, genome-asia.ucsc.edu
* 128.114.198.32 = genome-test.gi.ucsc.edu, used by developers and for debugging
 
Although our site makes many requests to an institution's server, each request is small and quickly satisfied, so the total load on your webserver should be limited and system administrators will likely not have an issue with adding this exception.


= Public Hub Examples =
* Never represent the same [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#subGroups subgroup] in both view and as a dimension (e.g. NOT dimensions dimX=view). For that matter a subgroup should never be in two dimensions (e.g. NOT dimensions dimX=cell dimY=mark dimA=cell). The composite will appear to function but multiple ways of selecting the same thing will create a confusing and inconsistent User Interface.
 
Many of the [http://genome.ucsc.edu/cgi-bin/hgHubConnect public hubs] in the Genome Browser provide excellent examples or templates for creating your own hub! As a reference for interpreting trackDb.txt lines used in these example hubs, please refer to the ''Hub Track Database Definition'' [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#toc glossary].
 
Some ''Hub Track Database Definition'' settings like [https://genome.ucsc.edu/goldenpath/help/hubQuickStartFilter.html filters] have additional help documentation. Also note that if you are only displaying one genome you can use the [https://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#UseOneFile useOneFile on] setting.
 
== Example Track Hubs ==
 
=== Example 1 ===
 
The [http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&hubUrl=http://apprisws.bioinfo.cnio.es/trackHub/hub.txt Principal Splice Isoforms APPRIS hub] provides a good example of a basic hub that includes a few different annotation tracks. Each track includes its own description page and is colored in a way that distinguishes it from the other tracks in the hub and from the native tracks in the UCSC Genome Browser.
 
Here are some links to their configuration files and some description pages:
* [http://apprisws.bioinfo.cnio.es/trackHub/hub.txt hub.txt]
* [http://apprisws.bioinfo.cnio.es/trackHub/genomes.txt genomes.txt]
* [http://apprisws.bioinfo.cnio.es/trackHub/trackDb.hg38.txt trackDb.txt] for the default hub assembly, hg38
* Description page for [http://apprisws.bioinfo.cnio.es/trackHub/docs/APPRIS.html APPRIS - Principal Isoforms] track (see track for hg38 in the Genome Browser [http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg38&hubUrl=http://apprisws.bioinfo.cnio.es/trackHub/hub.txt&g=hub_67585_PrincipalIsoformsAPPRIS here])


==Guidelines for Using Dimensions:==
=== Example 2 ===
* There should be no [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#dimensions dimensions] with a single entry (do not have only one cell line represented in dimX=cell), unless data growth is expected to fill in additional entries.
* Using only one dimension: preferably use dimX (e.g. dimensions dimX=cell). This saves vertical User Interface space, but is not always the best choice.
* Using two dimensions: use dimX and dimY (e.g. dimensions dimX=cell dimY=mark)
* Using more than two: use dimX, dimY on the most important dimensions. Then use dimA,B,C as needed on lesser dimensions. (e.g. dimensions dimX=cell dimY=mark dimA=donor_id)
* The A,B,C.. dimensions should probably use [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#filterComposite filterComposite] (e.g. filterComposite dimA)
* Each dimension and views should be represented in [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#sortOrder sortOrder], ideally in order of dimX, dimY, dimA,B,C, view (e.g. sortOrder cell_type=+ mark=+ donor_id=+ view=+). But the hub user may wish for a different sortOrder, which is fine.
* Tags of subGroup/dimension should be short and sweet with no special chars. Also labels can have HTML codes embedded (e.g. NOT CPG_methylation_%=CPG_methylation_% RATHER mpct=CPG_methylation_&_#37)


==Miscellaneous Guidelines:==
The [http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://vizhub.wustl.edu/VizHub/RoadmapIntegrative.txt Roadmap Epigenomics Integrative Analysis Hub] provides a great example of how you might organize your tracks if you have thousands of them. The hub uses composites with dimensions to organize thousands of different tracks across a number of cell lines and then uses superTracks to group these tracks even further.
*The use of ''metadata lines'' can be supported, users need to be well aware that '''support may be replaced''' by another system in the future.


=Public Hub Examples=
Here are some links to their configuration files and some description pages:
* [http://vizhub.wustl.edu/VizHub/RoadmapIntegrative.txt hub.txt] named “RoadmapIntegrative.txt”
* [http://vizhub.wustl.edu/VizHub/roadmapintegrativeall.txt genomes.txt] named “roadmapintegrativeall.txt”
* [http://vizhub.wustl.edu/VizHub/hg19/roadmap_both_02182015_trackDb.txt trackDb.txt] named “roadmap_both_02182015_trackDb.txt” for hg19


The browser's [http://genome.ucsc.edu/cgi-bin/hgHubConnect public hubs] provide excellent resources to see how others have created hub structures.  As a reference for interpreting trackDb.txt lines use the ''Hub Track Database Definition'' [http://genome.ucsc.edu/goldenPath/help/trackDb/trackDbHub.html#toc glossary]. For an example of hub configuration and documentation, one example is the [http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&hubUrl=http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt ENCODE Analysis hub]:
== Example Assembly Hub ==


http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt
The [http://genome.ucsc.edu/cgi-bin/hgTracks?genome=CB4856Princeton_JR-contig&hubUrl=http://waterston.gs.washington.edu/trackhubs/isolates/hub.txt C elegans isolates hub] provides an excellent example of what your assembly hub could look like. The hub creators provide a detailed description page for each assembly, many different annotation tracks each with their own description page, and clearly defined track groups that keep related tracks together.
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/genomes.txt
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/trackDb.txt
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformTfbs.html
http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hg19/uniformRNA.html


Regarding creating meaningful html documentation, if you are creating a hub based on a paper, we suggest the paper's abstract as a useful start for your track's ''Description'' section. The ''Methods'' section should have more detail, and please include a contact for questions. Lastly, it is best to assume a broad audience of students as well as researchers. For example, it is best to spell out common acronyms for those who may be new to genomics.
Here are some links to their configuration files and some description pages:
* [http://waterston.gs.washington.edu/trackhubs/isolates/hub.txt hub.txt]
* [http://waterston.gs.washington.edu/trackhubs/isolates/genomes.txt genomes.txt]
* [http://waterston.gs.washington.edu/trackhubs/isolates/CB4856Princeton_JR-contig/trackDb.txt trackDb.txt] for the primary genome in the hub, CB4856Princeton_JR-contig
* [http://waterston.gs.washington.edu/trackhubs/isolates/CB4856Princeton_JR-contig/groups.txt groups.txt] that defines track groups for CB4856Princeton_JR-contig
* Description pages for CB4856Princeton_JR-contig
** [http://waterston.gs.washington.edu/trackhubs/isolates/CB4856Princeton_JR-contig/description.html Assembly gateway] (see gateway page [https://genome.ucsc.edu/cgi-bin/hgGateway?genome=CB4856Princeton_JR-contig&hubUrl=http://waterston.gs.washington.edu/trackhubs/isolates/hub.txt here])
** [http://waterston.gs.washington.edu/trackhubs/isolates/CB4856Princeton_JR-contig/Rajewsky.description.html Rajewsky Mixed Stage RNAseq] (see track [http://genome.ucsc.edu/cgi-bin/hgTrackUi?genome=CB4856Princeton_JR-contig&hubUrl=http://waterston.gs.washington.edu/trackhubs/isolates/hub.txt&g=hub_17367_Rajewsky here])
** [http://waterston.gs.washington.edu/trackhubs/isolates/CB4856Princeton_JR-contig/blat_N2_cDNA_models.description.html WS230 cDNA blat Annotations] (see track [http://genome.ucsc.edu/cgi-bin/hgTrackUi?genome=CB4856Princeton_JR-contig&hubUrl=http://waterston.gs.washington.edu/trackhubs/isolates/hub.txt&g=hub_17367_blat_N2_cDNA_models here])

Revision as of 21:36, 15 September 2021

This page is intended to lay out guidelines for those who are trying to create Public Hubs. If you’ve created a hub that you feel meets these requirements and is of general interest to the research community, please contact us at genome-www@soe.ucsc.edu to have it added to the list.

(As a reference for interpreting trackDb.txt settings use the Hub Track Database Definition glossary)

Required Guidelines

The following guidelines must be met before your hub will be added to our public list:

  • Required for both track and assembly hubs:
    • You MUST have a description page for every configuration page (composite, superTrack, or standalone track). Note that multiple tracks and/or composites can use the same description page with the “html” setting. You can find more information on creating track description pages in the Track description page recommendations section below.
    • All of your description pages MUST have a contact email address prominently displayed.
    • Try to have no more than 10 tracks with visibility set to display (in full, pack, dense, or squish) as default upon first connecting your hub.
    • Specify a descriptionUrl html page in your hub.txt. This should be a URL to a description page for your entire hub; public hubs often link to a full-text paper or to a laboratory webpage that describes the research presented in the hub. On the Public Hubs page, the longLabel from hub.txt is displayed as a hyperlink to this page, while the shortLabel is a hyperlink to the hub.txt location.
  • Required for only assembly hubs:
    • Add a gateway page for each assembly by including an htmlPath line in genomes.txt for each genome not already hosted by UCSC. See the Assembly Hubs Wiki.
    • The following settings should be properly set in your genomes.txt (the last three settings make it easier to find assembly hub species through the hgGateway UI search):
      • defaultPos
      • scientificName
      • organism
      • description
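
As a rough illustration of the required settings above, a hub.txt and an assembly-hub genomes.txt stanza might look like the sketch below. The hub name, labels, email address, file paths, coordinates, and the assembly name myAsm1 are hypothetical placeholders and do not come from any real hub; only the setting names are meant to be taken literally. The two files are shown together for brevity but would be saved separately.

```
# hub.txt (all values are placeholders)
hub myExampleHub
shortLabel Example Hub
longLabel Example hub illustrating the required public hub settings
genomesFile genomes.txt
email contact@example.org
descriptionUrl hubDescription.html

# genomes.txt (one stanza per assembly; all values are placeholders)
genome myAsm1
trackDb myAsm1/trackDb.txt
twoBitPath myAsm1/myAsm1.2bit
groups myAsm1/groups.txt
htmlPath myAsm1/description.html
defaultPos chr1:1000000-2000000
scientificName Genus species
organism Example organism
description Jan. 2021 example assembly
```

Here descriptionUrl points at the hub-wide description page, and htmlPath supplies the gateway page for an assembly that UCSC does not already host.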

Recommended Guidelines

The guidelines in the following sections are recommended to improve the user experience, but they do not need to be implemented before the hub is added to our list of Public Hubs.

Track organization recommendations

Related tracks can be grouped in a few different ways, namely superTracks, multiWigs, and composites. If your hub includes a large number of tracks, the grouping of tracks may be necessary. This will prevent your hub’s track group from being an overwhelming mess of individual tracks and can make user configuration of your tracks easier.

Composite tracks

Related tracks of the same data type (e.g. a set of related bigBed tracks) should be combined into composites where appropriate.

  • Have multi-view only when there is more than one view. Views ideally give alternate access to the same data (e.g. signals and called peaks). Keep in mind that the value of views is that they allow for more than one data/configuration type (e.g. bigBed and bigWig) in a single composite. All subtracks of a view must have the same data type. Likewise, all subtracks of a non-multi-view composite must be the same type.
  • Recommendations for using dimensions with your composite tracks:
    • There should be no dimensions with a single entry (do not have only one cell line represented in dimX=cell), unless data growth is expected to fill in additional entries.
    • Using only one dimension: preferably use dimX (e.g. dimensions dimX=cell). This saves vertical User Interface space, but is not always the best choice.
    • Using two dimensions: use dimX and dimY (e.g. dimensions dimX=cell dimY=mark)
    • Using more than two: use dimX, dimY on the most important dimensions. Then use dimA,B,C as needed on lesser dimensions. (e.g. dimensions dimX=cell dimY=mark dimA=donor_id)
    • The A,B,C.. dimensions should probably use filterComposite (e.g. filterComposite dimA)
    • Each dimension and view should be represented in sortOrder, ideally in the order dimX, dimY, dimA,B,C, view (e.g. sortOrder cell_type=+ mark=+ donor_id=+ view=+). The hub user may wish for a different sortOrder, which is fine.
    • Tags of subGroup/dimension should be short and simple, with no special characters. Labels, on the other hand, can have HTML codes embedded (e.g. NOT CPG_methylation_%=CPG_methylation_% RATHER mpct=CPG_methylation_&#37;)
    • Never represent the same subgroup in both view and as a dimension (e.g. NOT dimensions dimX=view). For that matter a subgroup should never be in two dimensions (e.g. NOT dimensions dimX=cell dimY=mark dimA=cell). The composite will appear to function but multiple ways of selecting the same thing will create a confusing and inconsistent User Interface.
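
The composite recommendations above can be sketched as a trackDb.txt fragment. This is only a minimal illustration under assumed names: every track name, subgroup tag, label, and file name below is hypothetical, and most subtracks are omitted to keep the example short. It shows two views (peaks and signal), two dimensions (dimX=cell, dimY=mark), and a sortOrder that lists the dimensions first and the view last.

```
track exampleComposite
compositeTrack on
shortLabel Example Comp
longLabel Example composite with two views and two dimensions
subGroup1 view Views PK=Peaks SIG=Signal
subGroup2 cell Cell_Line cellA=Cell_A cellB=Cell_B
subGroup3 mark Mark markA=Mark_A markB=Mark_B
dimensions dimX=cell dimY=mark
sortOrder cell=+ mark=+ view=+
type bed 3
html exampleComposite

    track exampleCompositeViewPeaks
    parent exampleComposite
    shortLabel Peaks
    view PK
    visibility dense
    type bigBed 3

        track cellAmarkAPeaks
        parent exampleCompositeViewPeaks on
        shortLabel CellA MarkA Pk
        longLabel Cell A, mark A called peaks
        subGroups view=PK cell=cellA mark=markA
        bigDataUrl cellA_markA_peaks.bb
        type bigBed 3

    track exampleCompositeViewSignal
    parent exampleComposite
    shortLabel Signal
    view SIG
    visibility full
    type bigWig

        track cellAmarkASignal
        parent exampleCompositeViewSignal on
        shortLabel CellA MarkA Sig
        longLabel Cell A, mark A signal
        subGroups view=SIG cell=cellA mark=markA
        bigDataUrl cellA_markA_signal.bw
        type bigWig

# Subtracks for the remaining cell/mark combinations would follow the same pattern.
```

Each view holds subtracks of a single data type, and no subgroup tag appears in more than one dimension or in both a dimension and the view.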

Super tracks

Extremely large hubs may use superTracks as well to achieve a meaningful hierarchy. Super tracks can be used to group together any type of related tracks; for example, you could combine a multiWig, a composite and a bigBed track together into a single superTrack.
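
A superTrack that groups a composite with a standalone bigBed track might be declared roughly as follows. All names, labels, and file names are hypothetical placeholders, and the composite's subtracks are left out.

```
track exampleSuper
superTrack on show
shortLabel Example Super
longLabel Example superTrack grouping a composite and a bigBed track
html exampleSuper

    track exampleSignals
    parent exampleSuper
    compositeTrack on
    shortLabel Example Signals
    longLabel Example composite of signal tracks inside the superTrack
    type bigWig

    # Subtracks of exampleSignals are omitted from this sketch.

    track exampleAnnotations
    parent exampleSuper
    shortLabel Example Annot
    longLabel Example annotation track inside the superTrack
    visibility pack
    type bigBed 3
    bigDataUrl exampleAnnotations.bb
```

Because both children share the same html setting on the superTrack level only if you choose so, each child can still point to its own description page with its own html line.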

Track display recommendations

  • Avoid setting a composite track and all of its subtracks to the same visibility. When you have composite tracks that are hidden by default, it is best to still designate some subtracks to display when the composite track is turned on (visibility dense, versus the default of hide). This provides an example of your track data to users who turn on your composite track. If no subtracks are turned on by default, a user who changes your composite track visibility to "show" won't see anything.
  • The shortLabel text should be under 17 characters, or meaningful information may be cut off from display when tracks are set to "dense" visibility.
  • The length for a longLabel should be limited to around 75 characters.
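
One way to follow the display recommendations above is sketched below, with hypothetical track and file names. The composite stays hidden by default, one subtrack is marked "on" so that it appears as soon as a user turns the composite on, and the labels are kept within the suggested lengths.

```
track exampleSignals
compositeTrack on
shortLabel Example Signals
longLabel Example signal tracks, hidden until the user turns the composite on
visibility hide
type bigWig

    track exampleSignalA
    parent exampleSignals on
    shortLabel Signal A
    longLabel Example signal A, displayed when the composite is turned on
    visibility dense
    bigDataUrl exampleSignalA.bw
    type bigWig

    track exampleSignalB
    parent exampleSignals off
    shortLabel Signal B
    longLabel Example signal B, available but not selected by default
    bigDataUrl exampleSignalB.bw
    type bigWig
```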

Track description page recommendations

  • The description page should preferably contain UCSC's standard Description, Methods, Contacts... sections as defined here under "html" and here is an example template.
    • Here are some examples of well-done track description pages from various public hubs:
      • Liver DNA Methylation track in the Porcine DNA methylation hub - provides a nice example of how you can use colors on your description pages
      • APPRIS - protein structural information track in the Principal Splice Isoforms APPRIS hub - provides an example of how you might integrate images into your track description pages and use the same description page for multiple tracks
  • Your track description pages should provide meaningful documentation for your tracks
    • If you are creating a hub based on a paper, use the paper's abstract as a starting point for your track's Description section
    • The Methods section should expand upon the overview in the Description section and provide more details about how the data for the track was produced
    • You should assume a broad audience of students and researchers will use your hubs, so spell out common acronyms for those who may be new to genomics. For example, you might write out a term and its acronym as “Fluorescent in situ hybridization (FISH)”, which spells out the term once and then provides the acronym that you can use throughout the rest of your description page.
  • It might be a good idea to include a “Data Access” section on your track description page which describes how to access the data in your hub and where to download the raw data for the tracks in your hub. You can see some examples of “Data Access” sections on the NCBI RefSeq and UniProt track description pages.

Miscellaneous recommendations

  • Please note that hosting hub files on HTTP tends to work even better than FTP because of the difference in the number of open TCP connections needed.
  • Metadata lines are supported, but users need to be well aware that support may be replaced by another system in the future.
  • Create a Public Session that highlights the different data available in your hub in a biologically interesting area of the genome. Be sure to include a "Description" for your session. More about sessions can be found here.

Connection issues?

Sometimes the servers hosting public hubs undergo administrative changes and no longer successfully serve hub files. In most cases, it is likely that new firewalls at the institution are limiting access and causing these connection problems. Please ask your institution's administrators to add the IP addresses below as exceptions that are not limited.

These IP addresses are currently used by official genome browser mirrors:

  • 128.114.119.* = genome.ucsc.edu
  • 129.70.40.99 = European mirror, genome-euro.ucsc.edu
  • 134.160.84.67 = Asian mirror, genome-asia.ucsc.edu
  • 128.114.198.32 = genome-test.gi.ucsc.edu, used by developers and for debugging

Although our site makes many requests to an institution's server, each request is small and quickly satisfied, so the total load on your webserver should be limited and system administrators will likely not have an issue with adding this exception.

Public Hub Examples

Many of the public hubs in the Genome Browser provide excellent examples or templates for creating your own hub! As a reference for interpreting trackDb.txt lines used in these example hubs, please refer to the Hub Track Database Definition glossary.

Some Hub Track Database Definition settings like filters have additional help documentation. Also note that if you are only displaying one genome you can use the useOneFile on setting.
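
For a hub that displays a single genome, the useOneFile on setting lets the hub, genome, and track declarations live in one file. A minimal sketch, with hypothetical hub, track, and file names, could look like this:

```
hub myExampleHub
shortLabel Example Hub
longLabel Example single-file hub using useOneFile
useOneFile on
email contact@example.org
descriptionUrl hubDescription.html

genome hg38

track exampleTrack
shortLabel Example Track
longLabel Example bigBed track declared in the same file as the hub
visibility pack
type bigBed 3
bigDataUrl exampleTrack.bb
```

With this layout there is no separate genomes.txt or trackDb.txt; the genome stanza and the track stanzas simply follow the hub stanza.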

Example Track Hubs

Example 1

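The Principal Splice Isoforms APPRIS hub provides a good example of a basic hub that includes a few different annotation tracks. Each track includes its own description page and is colored in a way that distinguishes it from the other tracks in the hub and from the native tracks in the UCSC Genome Browser.

Here are some links to their configuration files and some description pages:

  • hub.txt (http://apprisws.bioinfo.cnio.es/trackHub/hub.txt)
  • genomes.txt (http://apprisws.bioinfo.cnio.es/trackHub/genomes.txt)
  • trackDb.txt for the default hub assembly, hg38 (http://apprisws.bioinfo.cnio.es/trackHub/trackDb.hg38.txt)
  • Description page for the APPRIS - Principal Isoforms track (http://apprisws.bioinfo.cnio.es/trackHub/docs/APPRIS.html)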

Example 2

The Roadmap Epigenomics Integrative Analysis Hub provides a great example of how you might organize your tracks if you have thousands of them. The hub uses composites with dimensions to organize thousands of different tracks across a number of cell lines and then uses superTracks to group these tracks even further.

Here are some links to their configuration files and some description pages:

  • hub.txt named “RoadmapIntegrative.txt”
  • genomes.txt named “roadmapintegrativeall.txt”
  • trackDb.txt named “roadmap_both_02182015_trackDb.txt” for hg19

Example Assembly Hub

The C elegans isolates hub provides an excellent example of what your assembly hub could look like. The hub creators provide a detailed description page for each assembly, many different annotation tracks each with their own description page, and clearly defined track groups that keep related tracks together.

Here are some links to their configuration files and some description pages: