Opsin evolution: key critters (ecdysozoa): Difference between revisions

From genomewiki

Revision as of 21:23, 12 December 2009

Key Critters: introduction to genome projects opsins

Some species such as Drosophila have lost all ciliary opsins -- clearly this class of genes is not essential for a successful, visually complex flying insect with 5-color vision, peripheral motion detection, polarized light capability and circadian rhythm (as one might have assumed from vertebrates). Other protostome lineages such as nematodes (eg Caenorhabditis elegans) function successfully without any vision at all, making this 'model organism' completely irrelevant to the evolutionary study of vision.

However bees, annelids, and mammals retain ciliary opsins so it follows -- pervasive, detailed convergence at the molecular level being impossible -- that this must be the ancestral bilaterian state. In turn that suggests ciliary opsins in cnidaria and indeed that has recently been established in the lensing eye.

When the eye is reduced to a single pigment cell backing a single photoreceptor cell, the opsin of that species may be expressed only in one cell of the entire body. In this situation, the opsin may never show up in transcript collections, even with subtraction of common ones. One sees the importance of complete genomes here (versus transcripts or immunostained sections alone): absence of ciliary opsin evidence in a genome is truly evidence of ciliary opsin absence.

Vertebrates could never have evolved ciliary opsin vision had the bilaterian ancestor possessed the limited opsin repertoire of fruit fly. Thus the most pressing question is -- assuming rhabdomeric opsins were thoroughly entrenched in the earliest bilaterian imaging eyes and photoreception systems -- what kept ciliary opsins around in early Bilateria? Recall early diverging deuterostomes (xenoturbellids, urchins, acorn worms, tunicates, and lancelets) lack imaging vision -- that emerged in full modern form on the lamprey stem.

Conversely, assuming cnidaria use ciliary opsins, what kept rhabdomeric opsins around so that they could later be co-opted by protostomes for their form of opsin-based vision? Evolution is strictly 'use it or lose it' over these time frames. Here cnidaria, or at least their larvae, may also use rhabdomeric opsins. It seems that both classes of opsins have retained roles in most species, but very different classes were promoted to the imaging role in different branches of Bilateria. In fly, ciliary opsins have winked out; in nematode, both ciliary and rhabdomeric opsins are gone. While irrevocable, these losses would scarcely receive comment in non-model organisms.

It's important to understand contemporary representatives of early diverging species (relative to the sequence of divergence nodes leading to human) are not archaic failed experiments nor primitive living fossils frozen in evolutionary time. Quite the contrary, all surviving extant species are equally successful and fully modern -- the tree of life is right-justified. Indeed their genes, regulatory signalling systems, and enzymes may be more finely honed than slowly evolving mammals because of more rapid evolution attributable to larger effective population sizes, reproductive mode, short generation time, and marine selective predatory pressures.

However we can still hope that ancestral character traits will be reflected to some extent in these earlier diverging species and that with enough complete opsin repertoires from taxonomically appropriate species, ancestral genes and even whole visual systems can be reconstructed at key ancestral nodes on the phylogenetic tree. The story describing the evolution of the human eye then amounts to describing its status at these successive nodes with perhaps interpolative speculation between them. Definite limits to knowledge exist because living metazoans provide only 35 nodes between sponge and human -- gaps between nodes may average 30 million years but can greatly exceed that (eg 135 myr between bird and platypus). This is offset by the occasional proposal for new deuterostome branches (Xenoturbella, Convoluta) or basal metazoans (ctenophores).

The ideal set of genomes needed to study the evolution of the metazoan eye is only partly completed, underway, or not even proposed yet. In some cases, the genome size of clade representatives is so large (eg lungfish at 25x human) the species may never be sequenced, though satisfactory opsin transcripts could still be obtained. In others, the rate of evolution has been so fast so long that very little information about photoreception at ancestral nodes has been retained (eg the tunicate Oikopleura). Hagfish opsins, which would conveniently break up the crucial lamprey long branch, are not available at GenBank but here the animal has adopted a deep water habitat, meaning that its cone opsin genomic repertoire will be highly reduced, if not gone entirely, in its markedly degenerated eyes, though whatever remains of its opsins could still be informative.

MoreBilatGenes.png

The impact of adding more genomes is to uncover more genes of the common bilaterian ancestor that were masked by lineage-specific losses. Recall the beetle genome Tribolium uncovered 126 additional genes absent in other insect genomes but nonetheless present in human. Humans themselves of course have lost hundreds of genes even relative to the first land animal, so here too we need to pool mammalian and amniote gene sets to reconstruct that ancestor.

Model organism choices do not always coincide with genome sequenceability, transcriptome projects, nor (worst of all) with more slowly evolving and less derived species. Finally, most sequencing speaks to narrow anthropocentric interests, whereas the sequencing need more broadly conceived is greatest farther back (to break up long branches). The evolution of the eye needs a rather different portfolio of genomes than a typical human disease gene because of the earlier intrinsic timing of the innovative events. In fact, one product of the investigation here is to spell out these needed genomes. Of course one obvious genome choice is a cubomedusan jellyfish, with its 24 eyes of 4 types.

It's worth reviewing genome status and recent experimental literature on key species. While abstracts are readily available at PubMed, access to free full text is unpredictable, so those links are collected when available. It suffices to reference only recent articles because those in turn cite the earlier literature, and later citations of those papers are collected by Google Scholar (or AbstractsPlus at PubMed). Most opsin sequences in the Opsin evolution reference sequence collection have a PubMed accession as a field in their fasta header database; those can simply be compiled to an active link that opens all of them in one PubMed window.
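The compilation of PubMed accessions into one link can be sketched as follows. This is only an illustration: the header field syntax `PMID:<digits>` is an assumption (the actual field layout of the reference collection is not given here), the function name `pubmed_link` is hypothetical, and the identifiers in the usage are placeholders. The sketch relies on PubMed accepting a comma-separated list of PMIDs as a search term.

```python
import re

# Assumed header convention: each fasta header carries "PMID:<digits>".
# The real collection may use a different field layout.
PMID_FIELD = re.compile(r"PMID[:=](\d+)")

def pubmed_link(fasta_text):
    """Collect unique PMIDs from fasta headers and build one PubMed search URL."""
    pmids = []
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence lines carry no accession fields
        for pmid in PMID_FIELD.findall(line):
            if pmid not in pmids:  # keep first-seen order, drop repeats
                pmids.append(pmid)
    return "https://pubmed.ncbi.nlm.nih.gov/?term=" + ",".join(pmids)

# Placeholder records, for illustration only.
example = ">opsinA PMID:111\nMKT\n>opsinB PMID:222\nAAA\n>opsinC PMID:111\nGGG\n"
print(pubmed_link(example))
```

Keeping the PMIDs in first-seen order means the link follows the order of the sequence collection itself.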


OpsineyePhylo.png


Figure adapted from: Acoel Flatworms Are Not Platyhelminthes: Evidence from Phylogenomics (H Philippe et al PLoS ONE. 2007 Aug 8)


Deuterostomes moved to separate article

Key critters have been broken down into 4 smaller articles -- deuterostomes are now here.

Chondrichthyes: Callorhinchus milii (elephantshark)         13 opsins
Agnatha:        Petromyzon marinus (lamprey)                 9 opsins
Agnatha:        Eptatretus burgeri (hagfish)                 0 opsins
Urochordata:    Ciona intestinalis (tunicate)                4 opsins
Echinodermata:  Strongylocentrotus purpuratus (sea urchin)   6 opsins
Hemichordata:   Saccoglossus kowalevskii (acornworm)         1 opsin
Deuterostomia:  Xenoturbella bocki + Convoluta pulchra       0 opsins

Lophotrochozoa moved to separate article

The Protostoma section has gotten too large. The seven Lophotrochozoa with genome projects are now here.

2.1 Annelida: Platynereis dumerilii (ragworm) .. 3 opsins
2.2 Annelida: Capitella sp (marine worm) .. 2 opsins
2.3 Annelida: Helobdella robusta (leech) .. 2 opsins
2.4 Mollusca: Aplysia californica (sea hare).. 2 opsins
2.5 Mollusca: Lottia gigantea (limpet) .. 2 opsins
2.6 Platyhelminthes: Schmidtea mediterranea (planaria) .. 1 opsin
2.7 Platyhelminthes: Schistosoma mansoni (trematode) .. 3 opsins

Cnidaria and Porifera moved to separate article

The key critter article has gotten too large -- cnidaria are now here.

Cubozoa: Tripedalia cystophora .. 1 ciliary opsin
Cubozoa: Carybdea marsupialis (jellyfish) .. probable opsins
Anthozoa: Nematostella vectensis (sea anemone) .. claimed opsins
Hydrozoa: Hydra magnipapillata (hydra) .. claimed opsins
Hydrozoa: Cladonema radiatum (jellyfish) .. claimed opsins

Porifera, Placozoa, Choanoflagellates .. 0 opsins

Ecdysozoa: 79 opsins

This clade includes insects and other arthropods but not molluscs and annelids (lophotrochozoa). The focus here is on species with genome projects that allow complete opsin repertoires to be determined, as supplemented by annotation transfer from experimental species when 1:1 orthology can be established.

Genome projects have not sampled ecdysozoan phylogenetic diversity evenly to date but that may change as small genomes can be rapidly sequenced today. Studies of photoreception in non-genome species are limited by their inevitably incomplete repertoire of sequenced opsins and companion genes. Opsins in genomic species have determinable intron positions, phases and flanking genes, and so offer better prospects for inference of accurate descent relationships.

DrosOpsin.jpg

An immense amount of experimental work on Drosophila melanogaster, recently reviewed from an evolutionary perspective, provides an excellent understanding of the evolutionary history underlying regulatory genetics, biochemistry, and developmental and structural homologization of opsin expression across larval Bolwig organs and adult ocelli and eye.

While annotation transfer to the other 11 fruit fly genome projects is largely justified, that becomes problematic even across Insecta because of gene loss in drosophilids (notably all ciliary opsins), lineage-specific tandem expansion of opsin multiplicities and the necessary rationales for their retention, derived conditions, and better representation of ancestral characteristics in other species. It will prove very difficult even to get at ancestral dipteran vision starting from Drosophila. Yet species with simpler vision like Tribolium are no living fossils either, having lost opsins.

Imaging vision in ecdysozoa (and lophotrochozoa) is quite different from the chordate system, with rhabdomeric opsins residing in specialized microvilli rather than ciliary opsins in modified cilia. The signalling system and chromophore regeneration also represent substantial departures. At first there seems no common ground for a shared Ur-bilaterian ancestor -- which signalling system was originally used for imaging vision and which lineage displaced it with the other? Some protostomes still utilize ciliary opsins in non-imaging photoreception and similarly some deuterostomes still utilize rhabdomeric opsins. Since the relevant opsin gene trees coalesce far earlier, this shows that the Ur-bilaterian possessed both opsin classes (without clarifying which system was used for imaging vision, if either).

Blastp of any rhabdomeric opsin from any protostome against the set of all deuterostome opsins invariably gives vertebrate melanopsins as best match, whereas blastp of any protostome ciliary opsin (pteropsin) always has best match to TMTs (ancestral form of encephalopsin). That is, from the biomedical perspective, rhabdomeric opsins are just a clade-specific expansion of melanopsins largely irrelevant to human vision. Similarly invertebrate ciliary opsins not used in imaging vision primarily inform us on deeper ancestral origin issues. Note melanopsin and TMT are not orthologs at the level of the Ur-bilaterian nor even the Ur-eumetazoan because gene duplication and divergence preceded the cnidarian last common ancestor.
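The best-match test can be sketched as follows, assuming blastp was run with the standard 12-column tabular output, in which the first two columns are the query and subject identifiers and the last is the bit score. The function names, subject identifiers and class labels below are hypothetical, and the scores in the usage are toy values rather than real results.

```python
import csv
import io

def best_hits(tabular_text):
    """Return {query: (subject, bitscore)} keeping the top bit score per query."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best

def classify(best, subject_class):
    """Transfer the class label of each query's best-matching subject."""
    return {q: subject_class.get(s, "unclassified") for q, (s, _) in best.items()}

# Toy rows (hypothetical subject names, invented scores).
rows = [
    ["queryR", "melanopsin_x", "57.0", "300", "120", "3", "1", "300", "5", "304", "1e-100", "350"],
    ["queryR", "tmt_x", "30.0", "280", "190", "6", "1", "280", "9", "288", "1e-30", "120"],
]
text = "\n".join("\t".join(r) for r in rows)
labels = {"melanopsin_x": "rhabdomeric", "tmt_x": "ciliary"}
print(classify(best_hits(text), labels))
```

Bit score is used rather than percent identity because it also reflects alignment length; a short, high-identity fragment would otherwise outrank a full-length match.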

The nature of vision at ancestral nodes has not yet been resolved, in part because pre-bilaterian cnidarian photoreceptors studied so far as outgroup have been either ciliary, or based on distantly related cnidarian-specific opsins, or in the case of coral melanopsin, genomic sequence not yet associated with photoreception. In the Ur-eumetazoan common ancestor, this could imply ciliary opsin imaging vision, no imaging vision but convergent evolution (later independent invention) in the box jellyfish lineage, or even rhabdomeric imaging vision with subsequent displacement by ciliary opsins in cubomedusa and separately in later deuterostomes. Sponge larvae presumably also utilize a ciliary opsin but here again it is unclear whether later metazoans use a system descended from that.

It's sometimes asserted that imaging vision systems (all highly dissipative of ATP) were first enabled in the rapidly oxygenating Cambrian ocean, yet near-simultaneity is not a good fit to the arthropod fossil record (stalked eyes) nor molecular reconstructions. For example, extant representatives of early diverging deuterostomes (xenoturbella, acornworms, echinoderms, tunicates, amphioxus) all lack imaging vision (depending on how that is defined in scanning larvae), so it seems clear that early arthropods had well-developed vision prior to the emergence of hagfish/lamprey. The majority of extant animal phyla have prospered for 540 myr without ever developing imaging vision.

Ecdysozoa .. opsin repertoire of the last common ancestor

Questions of ancestral opsin repertoires and their implied photoreception biology are best addressed after careful step-by-step reconstruction of opsin repertoires in each of the relevant lineages, exhausting available information in extant species rather than just adding to a century of speculation. These reconstructions can help evaluate candidates for 'living fossils'. Thus the focus in this section is reconstruction of the opsin repertoire of just the last common ancestor of ecdysozoa. That can be combined later with parallel efforts on ancestral lophotrochozoan and deuterostome opsins to get closer to the Ur-bilaterian.

EcdPhyl.jpg

This program has already been set in motion with important recent reviews and new sequencing of opsins in arthropod outgroups to the much-sampled Insecta, providing new opsins from crustaceans and chelicerates. The gene tree, as overlaid on clade divergence, shows color vision already well established prior to divergence of insects. However incoming data, in the form of a second distinct opsin from the ventral eye of horseshoe crab Limulus (a chelicerate, not a true crab, which is a malacostracan crustacean), already requires an earlier origin for the opsin class BcRh1 once thought specific to crustaceans.

Just as absence of effort on hagfish has needlessly delayed our understanding of chordate vision evolution, absence of effort on early diverging Ecdysozoa such as Pycnogonida (sea spiders), Myriapoda (millipedes), Onychophora (velvet worms), and Tardigrades (water bears) has seriously retarded reconstruction of the ancestral opsin repertoire in this lineage. Rather than yet another minor mammalian cone opsin or yet another butterfly gene expansion, biology is better served by sequencing more strategically placed species. Ironically the truly pivotal data may come out of genome projects rather than opsin research per se.

In classifying ecdysozoan opsins from a deeper evolutionary perspective, it is necessary to set aside narrow clade-specific expansions and contractions of opsin repertoires, however adaptively important to the individual species concerned. Wavelengths of peak absorption -- subject to significant change from tuning residue substitution -- seem an unsound basis for evolutionary classification (though in retrospect they work fairly well). This leaves phylogenetic alignment, signature residues and rare genomic events (such as indels and introns) as the main tools.

Here the remarkable observation was made in 2003 that a single lysine K90 (bovine rhodopsin numbering G90) suffices to define the phylogenetically valid class of ultraviolet opsins. Six years later, despite a vastly expanded data set, there is still near-perfect concordance of spectrophotometry, behavioral studies, alignment clustering, indel signature, intronic structure, and possession of lysine (rarely glutamate) at this position. This residue was previously known to be important to spectral tuning from bird C90S ultraviolet vision and human rhodopsin G90D night blindness.
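Reading off the signature residue requires mapping bovine rhodopsin position 90 onto an alignment column. A minimal sketch, assuming a pairwise or multiple alignment in which the bovine reference and the query are equal-length gapped strings with '-' for gaps; the function names are hypothetical and the toy strings in the usage are not real opsin sequence.

```python
def residue_at_reference_position(ref_aligned, query_aligned, ref_position):
    """Return the query character aligned to the 1-based ungapped reference position."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    count = 0
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char != "-":
            count += 1  # only reference residues advance the numbering
            if count == ref_position:
                return query_char  # may be '-' if the query has a deletion here
    return None  # reference is shorter than the requested position

def signature_call(residue):
    """Label the residue found at position 90 according to the K90 rule in the text."""
    if residue == "K":
        return "lysine: ultraviolet-class signature"
    if residue == "E":
        return "glutamate: rare variant, check other signatures"
    return "no K90 signature"

# Toy alignment: reference position 3 falls in column 4 because of the gap.
print(residue_at_reference_position("AC-DE", "ACKKE", 3))
print(signature_call("K"))
```

Counting only non-gap reference characters keeps the numbering tied to bovine rhodopsin regardless of insertions in the query, which matters here because the blue opsins carry a one-residue deletion near this site.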

This residue sits deep within transmembrane helix 2 as illustrated by a massive alignment of this region. That hydrophobic milieu is unworkable energetically for a charged residue unless a compensatory counterion exists. That negatively charged residue is presumably the ancestral counterion, negatively charged E171 (rather than the E113 of vertebrate ciliary opsins) or for positively charged residue, the Schiff base itself. K90, by taking E171 away from the Schiff base lysine K296, has the effect of leaving that unprotonated, an effect known to shift absorption into the ultraviolet.

Observe however that opsins specialized to blue (not ultraviolet) are also satisfactorily classified in this same region (June 2009 current alignment below). These opsins have some other residue than lysine at 90 but share a one-residue deletion near the lysine that would shift its orientation relative to the chromophore as well as a proline six residues after the DRY motif, which is glycine in all other ecdysozoan opsins. This agrees with conventional phylogenetic alignment that sisters blue and ultraviolet opsins to the exclusion of long wavelength and blue-green opsins as well as to the more basal BcRh opsins operationally defined by clustering to two particular opsins from the crab Hemigrapsus sanguineus.
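The proline-versus-glycine signature can be checked directly on unaligned sequence. A minimal sketch, with two assumptions: "six residues after" counts the first residue following the motif as position 1, and the variant ERY is accepted alongside DRY. Both are consistent with the two Ixodes records given later in this article, from which the fragments in the usage are taken. The function name is hypothetical.

```python
import re

def residue_after_dry(seq, offset=6):
    """Return the residue `offset` positions after the first DRY/ERY motif, or None."""
    match = re.search(r"[DE]RY", seq)  # assumption: ERY counts as the same motif
    if match is None:
        return None
    index = match.end() + offset - 1  # match.end() is the first residue after the motif
    return seq[index] if index < len(seq) else None

# Fragments around the motif, copied from the Ixodes records in this article.
print(residue_after_dry("MVMITLDRYNVIVRGVAAAP"))  # long wavelength type opsin
print(residue_after_dry("AALSLERYLALGRPRDPFA"))   # RH7-type UV opsin fragment
```

On these fragments the first call returns G and the second P, matching the rule that proline at this position marks the blue and ultraviolet class while other ecdysozoan opsins have glycine. A first-match search is only safe on fragments or when the motif is known to be unique; on full-length sequence the hit should be confirmed against the alignment.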

While K90 could potentially have arisen multiple times as the same solution to the problem of ultraviolet vision, the simultaneous presence of multiple other defining signatures renders this improbable. Opsins with K90 thus date back to the common ancestor of chelicerates and insects (ie Arthropoda) if not earlier, though no such opsins are seen in lophotrochozoan whole genome projects (eg the mollusk Aplysia) or deuterostomes. Blue optimized opsins appear limited to insects. Consequently prior to gene duplication and divergence, the ancestral gene had K90, hence ultraviolet vision not tuned to blue.

This implies coalescence of the blue, ultraviolet and RH7-class ultraviolet opsins to a single gene prior to arthropod divergence. Similarly, despite many lineage-specific expansions, the long and middle wavelength opsins merge into a single gene in this same era. The same can be said for their ciliary opsins and those of deuterostomes. Since lophotrochozoa and deuterostomes possess but a single class of melanopsins -- notably with two introns identical to arthropod RH7 -- it follows that Ur-protostomes and Ur-bilaterians had but two opsins (subject to the caveat that ancestral gene duplication followed by gene loss in all extant lineages is undetectable). The Ur-bilaterian did not have color vision but rather ocellar vision and circadian rhythm.

These two Ur-bilaterian opsins had not yet coalesced because both classes are observed in earlier diverging cnidaria. However reconstruction of ancestral sequence for ciliary and melanopsin gene families at this node begins to exhibit a merger. That can readily be seen using the opsin classifier because best-blastp of cnidarian melanopsin selects cnidarian ciliary opsins. Sponge larvae presumably have at least one opsin; in the common ancestor, this may predate the gene duplication leading to the two opsins of eumetazoa. The status of neuropsins, peropsins and RGRopsins is unclear at most of these nodes -- their role may reside more in carotenoid metabolism and retinoid replenishment.

Panarthropoda: Hypsibius (water bear) .. 0 opsins

A 5x genome project for Hypsibius dujardini, a member of Tardigrada, a phylum of microscopic ecdysozoans, was approved in July 2007 but Broad has not yet begun trace reads on the small 70 Mbp genome (suggesting densely spaced genes with small introns as this is not likely highly derived). It could prove very useful for opsins as tardigrades are basal to all of Arthropoda and so shed light on that last common ancestor. In fact with accompanying centipede, horseshoe crab, amphipod, and priapulid genomes, the whole ecdysozoan ancestor will be accessible.

TardiEyes.jpg

The only known fossil specimens are found in Siberian mid-Cambrian deposits and much later amber. The older fossils have three pairs of legs rather than four, a simplified head morphology, and no posterior head appendages and probably represent a stem group of extant tardigrades. Aysheaia from the Burgess Shale might be related to tardigrades.

Nothing is currently known about photoreception or opsins in tardigrades -- barely that they have eyes. A rhabdomeric opsin at the minimum may be expected in front of the pigment cups. However the current GSS and EST collections (about 6000 sequences) do not contain any convincing matches using various rhabdomeric and ciliary opsins as tblastn queries.

Greven has recently reviewed the situation in regard to tardigrade eyes. These consist of a pair of inverse pigment-cup ocelli located in the outer lobe of the brain. One (sometimes two) microvillous (rhabdomeric) cells are the apparent photoreceptors, which are backed by a single pigment cup cell containing pigment granules (of unknown chemistry) in the outer dorsolateral lobe of the brain. Ciliary sensory cells located close by are probably epidermal mechano- and chemoreceptors rather than photoreceptors.

Phototaxis cannot necessarily be attributed to the ocelli prior to determination of the complete opsin repertoire of the tardigrade genome and its anatomical assignments. It is safe to predict however that the ocellus opsin here will classify as a basal melanopsin. A ciliary opsin, known to be present in the tardigrade ancestor, may well be retained. Here the question is whether it is expressed in a ganglion, perhaps homologously to those of Platynereis.

Panarthropoda: Onychophora (velvet worm) .. 0 opsins

The key arthropod outgroup Onychophora is also completely lacking in opsin data even though their eyes may provide important clues to the evolution of arthropod rhabdomeric vision -- a pair of simple ocelli at the base of the antennae on the first segment may be the ancestral visual design. The anatomy here consists of a chitinous ball lens, a cornea-like covering and a retina connected to the brain center via an optic nerve. Various Cambrian fossils look more or less like onychophorans, eg Aysheaia, but overall Onychophora do not support a Cambrian explosion.

OnychoEye.jpg

G. Mayer makes the surprising observation that onychophoran eyes are innervated to the central (rather than lateral like ommatidia) part of the brain. More specifically, the posterior branch of the optic nerve connects to the posterior lamina of the central brain whereas the anterior branch, after bifurcating again, joins nerves connecting the antennal glomerulus to the mushroom body. Further, these everse eyes originate embryologically from an ectodermal groove rather than the lateral proliferation zone of ommatidia, which develop from lateral ectoderm of the ocular segment. Consequently onychophoran eyes are better homologized to median ocelli of euarthropods than to their compound eyes.

Despite some historic confusion over cilia in onychophoran photoreceptors, the photoreceptors reside in microvilli and the ocelli are unambiguously rhabdomeric. However the presence of 9x2+0 cilia raises the question of whether the shift seen in deuterostomes is an abrupt discontinuous difference or less cosmic change just in intracellular targeting of gene expression and membrane ramification.

FossilEyeLobo.jpg

The number of these ocelli varies -- apparently because of lineage-specific structural duplications -- but the ancestral number, inferred to be two from extant lineages, has fossil support if the paired dark spots in the middle of the head in the Lower Cambrian species Luolishania longicruris (synonym: Miraluolishania haikouensis) are its only (lensed ocellar) photoreceptors. In this view, ommatidial eyes did not furnish the primary ancestral vision but are rather a dramatic later expansion of lateral photoreceptors within euarthropods.

This has implications for the opsins used in these respective photoreceptors and the evolution of this gene family. Here it will be important to determine the full repertoire of onychophoran opsins and where each is expressed. The hope here is that a stable association exists of opsin type with photoreceptor types, allowing more to be deduced about photoreception in the ancestral protostome and bilateran.

Recent Lower Cambrian lobopodian fossils from China have clarified the anatomy of these 543 myr old fossils and their phylogenetic relationships to living onychophorans (which they closely resemble). The paired dark spots interpreted as [non-compound ocellar] eyes are quite small and positioned more dorsally than laterally (which has implications for central rather than lateral innervation). The light environment was bright (shallow marine).

LoboCambr.jpg

While these fossils are probably not in the exact line of descent to any contemporary onychophoran, the last common ancestor is not far removed. They thus suggest that two symmetrically placed ocelli is the ancestral state and that these have a continuous homologous history without confusing gains and losses in photoreceptor structures or major brain re-wiring.

The question is whether these presumed ocelli also gave rise to compound eyes through structural splitting and subsequent specialization in descendent arthropod lineages. Conceivably the major optical system of arthropods evolved later from scratch (though recycling existing components and evo-devo regulatory modules). Another scenario is that these fossils -- and perhaps extant onychophorans -- had additional photoreceptors deployed without telltale pigment cups (making them effectively undetectable, as with ciliary opsins in protostomes). For example lateral photoreceptors providing roll orientation might have evolved into compound eyes with lateral brain connections.

While we will never know the sequence of the opsins utilized in these fossils, they are likely orthologous to the opsins in contemporary onychophorans, recalling the definition of orthology references the LCA and allows for lineage-specific gene duplications. Note however extant velvet worms are not exactly living fossils, being terrestrial animals of dark habitats, a shift accompanied many times by adaptive mutations in opsins. Perhaps methods of ancestral sequence reconstruction can adjust for this in some way.


Chelicerata: Ixodes scapularis (tick) 2 opsins

The genome project was completed long ago but has experienced a multi-year bottleneck in assembly release and publication. However contigs built from a subset of 19.4 million traces became available to tblastn of the GenBank "wgs" division by late 2007. Ixodes has a very conservative genome (regrettably 2.1 Gbp in size), seemingly far less derived than drosophilids in matters such as intron and gene retention and protein sequence conservation. This, in conjunction with the helpful phylogenetic position of chelicerates as outgroup to the many insect genomes, has improved prospects for reconstructing the ancestral opsin repertoire of Arthropoda and eventually Protostomia and the Ur-bilaterian.

A large collection of annotated Ixodes ESTs is available at the DFCI Gene Index of which 3 are marked up (2 wrongly) as opsins. Using the Opsin Classifier, the full length gene could be recovered for the first of these (TC19272) on 24 Nov 07, intronated at the Trace Archives (4 introns, superb coverage), and added to the classifier fasta collection as LMS_ixoSca. It classifies with rhabdomeric opsins (ie with melanopsins) with a very respectable 57% maximal percent protein identity. The second and third intron have classical ancestral position (following GWSR and LAK) and phase (2 and 0). Synteny awaits assembly of large contigs -- adjacent exons are not spanned by single traces.

A fragment from a second melanopsin, found in June 2009, has two exons and a best blastp hit clearly diagnostic of an RH7-type UV opsin, but has E in the K90 position like some insect blue-sensitive opsins. Assembly contigs are very short, ruling out synteny comparison, and coverage is lacking for the first exon. There is no sign of additional UV or blue opsins. No ciliary opsin is present in the current set of traces. Ixodes thus appears to have a small repertoire of opsins.

>UV7_ixoSca Ixodes scapularis (tick) exon 1 missing, exon 2 disjunct, K90 is EIP
0  2 
1 RRRIRSQANLLVFNLALSDLLMVLEIPLLVYNSLKLRPALGVW 1
2 GCQLYGLMGGLSGTSAIFSIAALSLERYLALGRPRDPFARLTRSRAFALSLSSWIYALCFSAWPLLGVTSPYVPEGFLTSCSFHFLSDATSDRCFVWIFFVAAWCVPLVFVTTCYSGILVTVIRSR KALAQES
RRSELRVAKVSLALVLLWTVAWTPYAIVALLGITGRRNLLTPWGSMAPAMFCKSAAVLDPFVYGLSHPSFRRELAIMLPCLRPRQRPVSLTLRAVVQLPKRPGPRSAGSSTSVPVTAPGTTKDNHCPTPPNVSR* 0

>LWS_ixoSca Ixodes scapularis ocellar TC19272 UP|OPSO_LIMPO 
0 MGSEGQRTNMSLLDELASPYMKNGTLVESVPDEMLYMVHPHWYNFKPMNPLWHSLLGFAMVILGVISVVGNSMVIYIMTTSKSLRSPTNMLVVNLAFSDW 2
1 CMMAFMMPTMAANCFAETWILGPFMCEVYGMVGSLFGCGSIWSMVMITLDRYNVIVRGVAAAPLTHKRAALMIFFVWFWALTWTLLPFFGWSR 2
1 YVPEGNMTSCTIDYLTKALWSASYVVAYAGGVYWTPLFINIYCYSKIVRAVAQHEKQLRLQARKMNVASLRANAEQTKTSAEARLAK 0 
0 IALMTVGLWFMAWTPYLTIAWAGIFSDGSKLTPLATIWGSVFAKANACYNPIVYGISHPKYRAALARRFPSLVCMPPGGDQLDTRSEASGITTIEDKVMTTET* 0

The 11 chelicerate opsins available in June 2009 (after consolidating nuisance GenBank entries for Limulus):

BCR_limPol  Limulus polyphemus  (horseshoe_crab) opsin 5         FJ791252 ventral eye
LMS_limPol  Limulus polyphemus  (horseshoe_crab) CRBOPSINA       L03781   lateral eye and ocellus (5 aa variant)
LMS1_hasAda Hasarius adansoni   (jumping_spider) HaRh1 kumopsin1 AB251846
LMS2_hasAda Hasarius adansoni   (jumping_spider) HaRh2 kumopsin2 AB251847
LMS1_plePay Plexippus paykulli  (jumping_spider) PpRh1 kumopsin1 AB251849
LMS2_plePay Plexippus paykulli  (jumping_spider) PpRh2 kumopsin2 AB251850
LMS_ixoSca  Ixodes scapularis   (tick)                           P35361   ocellus
LMS_loxLae  Loxosceles laeta    (spider) fragment                EY188471 venom gland
UVK_hasAda  Hasarius adansoni   (jumping_spider) HaRh3 kumopsin3 AB251848
UVK_plePay  Plexippus paykulli  (jumping_spider) PpRh3 kumopsin3 AB251851 
UV7_ixoSca  Ixodes scapularis   (tick)                           contigs
PycnoEyes.jpg

The phylogenetic arrangement of these species is (Limulus,(Ixodes,(Loxosceles,(Hasarius,Plexippus)))). Molecular clock dating of divergences (late Paleozoic just for land chelicerates) is under some dispute.
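The parenthesized arrangement above is a bare Newick topology, so it can be turned into a nested structure for programmatic use. This is a minimal sketch that handles names only (no branch lengths, no internal labels); the function names are hypothetical.

```python
def parse_newick(text):
    """Parse a names-only Newick topology into nested tuples of strings."""
    pos = 0

    def node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")"
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in "(),;":
            pos += 1
        return text[start:pos]

    tree = node()
    if text[pos:] not in ("", ";"):
        raise ValueError(f"unexpected trailing text: {text[pos:]!r}")
    return tree

def leaves(tree):
    """Return leaf names in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]

TREE = parse_newick("(Limulus,(Ixodes,(Loxosceles,(Hasarius,Plexippus))))")
print(TREE)
print(leaves(TREE))
```

In the nested form the two jumping spiders sit together in the innermost tuple, with Limulus alone as the outgroup at the top level.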

This leaves Pycnogonida (sea spiders) as the last major unrepresented chelicerate group (even more basal if chelifore appendages aren't homologous to true chelicerae, in conflict with Hox expression boundaries showing anterior-most appendages also deutocerebral). Shallow water species have two pairs of dorsally located eyes. Given that the body is generally just a millimeter or two, these eyes are small and quite simple. A longwinded 1891 dissertation on their larval and adult anatomy is available, as are a 1973 ultrastructural account and a modern account.


Crustacea: Daphnia pulex (water flea) .. 7+ opsins

Opsin daphniaJGI.png

An 8.7x genome assembly was released in July 2007 at JGI with further support at wFleaBase. A May 2009 meeting report suggests an imminent release of initial publications by the 370-member consortium.

The gene count, supposedly 39,000, may be inflated by genomic transcript noise that does not really code for protein, by contig assembly errors resulting from polymorphism and use of paired end reads, and by over-counting of gene fragments and recent processed pseudogenes. JGI and Gnomon models to date err grievously on ciliary opsin gene models because they lack the last exon (below), which is necessary to complete the covalent lysine motif to FR.

This crustacean, basal to Hexapoda arthropods, provides a potentially important outgroup to insects (together forming Pancrustacea). However the opsin story, summarized in a meeting abstract, is an embarrassment of riches, not conducive to deducing ancestral arthropod genome content. The total number of opsin genes came in at 37, comprising 22 rhabdomeric opsins (mostly long wavelength), 7 ciliary opsins (pteropsins), and 8 in a novel family without close affiliates. A post on the Ixodes listserv even raises this to 46 by Feb 2008.

This seems excessive given that Daphnia has a single medial compound eye with merely 22 ommatidia of 8 photoreceptors each, an under-focusing lens, and a three-ocellus naupliar eye, yet also circadian rhythms and a need to assess water turbidity, depth, and distance from shore. Daphnia can also detect polarized light. It's not clear that the exquisite color discrimination potentially afforded by dozens of opsins would be advantageous for a 22-pixel array; experimentally, only four wavelengths of peak sensitivity are observed, at 348 (UV), 434, 525, and 608 nm in dorsal ommatidia.

Again the possibility arises that K-rhodopsin gene duplicates could have taken on other sensory or metabolic roles (digestion of complex algal carotenoids). Planned in situ hybridization studies may illuminate biological roles of these opsins. The pteropsins are probably of most interest from the urbilateran perspective.

Gene models have not been submitted yet to GenBank but are extractable by text query at wFleaBase. What is needed here however is not the clutter of 37 sequences but their collapse into UV, blue, long, pteropsin, and novel ancestral representatives. This would remove the noise from lineage-specific expansions. The intron structure could provide very important support to classification schemes.

To a certain extent, this has been accomplished by June 2009 at FleaBase, as text searching by 'opsin' turns up 25 matches, many in tandem pair sets (which could reflect assembly error to some extent). There is no explanation of how 37 opsins got expanded to 46 and then reduced to 25, with ciliary and novel opsins no longer listed. Despite assigned accessions, no gnomon gene models have been released at NCBI.

DaphniaEye.jpg

Intronated gene models can be manually extracted from scaffold dna (done for four below). These models, taken at face value, unsurprisingly have best-blast at GenBank to Triops and other crustaceans (20 non-Daphnia opsins, all melanopsins), which mercifully have been analyzed in a careful Feb 2009 paper. This study considered only non-EST Branchiopoda (like Daphnia) and Malacostraca melanopsin sequences that likely under-represent opsin evolutionary information available from the full seven classes of Crustacea.

The only Daphnia opsin (NCBI_GNO_472553) with a transcript (FE295533) has been assigned to the BCRH1 group (middle wavelength MWS). One Daphnia opsin has a lysine at position K90 (bovine rod rhodopsin numbering) considered proof of UV purposing.

The value of Daphnia genomic opsins relative to other crustaceans lies in their intronation, which distinguishes expansions arising through retroprocessing from tandem and segmental duplication of a few master intronated genes (which would then be the orthologs to other arthropod opsins).

Indeed the intronation pattern -- typically far more deeply conserved than protein sequence -- could link pteropsins more convincingly to lophotrochoan and deuterostome opsins than alignments with percent identities in the 20's. However, in comparison to Apis opsin counterparts, Daphnia has experienced numerous intron gains and losses, not furnishing a good guide to the ancestral state.

NCBI_GNO_176434 scaffold_53:626704-628972  Blue opsin [probable ortholog of Triops longicaudatus RhC]
NCBI_GNO_416624 scaffold_95:369266-373273  Opsin Rh3 Inner R7 photoreceptor cells opsin
NCBI_GNO_366144 scaffold_14:844292-847788  Melanopsin 
NCBI_GNO_557324 scaffold_2568:2224-6662    Short wavelength-sensitive opsin [defective model fragment but KMAACVDPFVYAINHPKYR]
NCBI_GNO_750363 scaffold_40:707906-709794  Compound eye opsin BCRH1 (brachyuran crab RH1)
NCBI_GNO_754363 scaffold_40:716143-718346  Compound eye opsin BCRH2 (brachyuran crab RH2)
... (rest are BCRH1 and BCRH2 types)
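The scaffold coordinates in the list above are enough to compute gene spans and to spot candidate tandem pairs. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (an assumption, since the page does not state the convention); the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Gene models and coordinates copied from the list on this page.
MODELS = {
    "NCBI_GNO_176434": "scaffold_53:626704-628972",
    "NCBI_GNO_416624": "scaffold_95:369266-373273",
    "NCBI_GNO_366144": "scaffold_14:844292-847788",
    "NCBI_GNO_557324": "scaffold_2568:2224-6662",
    "NCBI_GNO_750363": "scaffold_40:707906-709794",
    "NCBI_GNO_754363": "scaffold_40:716143-718346",
}

COORD = re.compile(r"^(scaffold_\d+):(\d+)-(\d+)$")

def parse_coord(text):
    """Split 'scaffold_N:start-end' into (scaffold, start, end)."""
    m = COORD.match(text)
    if m is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    return m.group(1), int(m.group(2)), int(m.group(3))

def span_lengths(models):
    """Genomic span of each model in bp, assuming inclusive coordinates."""
    spans = {}
    for name, coord in models.items():
        _, start, end = parse_coord(coord)
        spans[name] = end - start + 1
    return spans

def neighbour_gaps(models):
    """For models sharing a scaffold, list (left, right, bp between them)."""
    by_scaffold = defaultdict(list)
    for name, coord in models.items():
        scaffold, start, end = parse_coord(coord)
        by_scaffold[scaffold].append((start, end, name))
    gaps = []
    for genes in by_scaffold.values():
        genes.sort()
        for (_, end1, left), (start2, _, right) in zip(genes, genes[1:]):
            gaps.append((left, right, start2 - end1 - 1))
    return gaps

print(span_lengths(MODELS))
print(neighbour_gaps(MODELS))
```

Only the BCRH1 and BCRH2 models share a scaffold in this subset, separated by a few kilobases, which fits the tandem pair pattern noted for the pipeline-labelled genes.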

>UVV1_dapPul Daphnia pulex NCBI_GNO_176434 FE384049 EST 53% identical Apis mellifera 69% Triops RhC
0 MLGWNTPEDYMSYVHP 21 YWKTFEAPNPFLLYMIGFLYTIFMFCCVAGNGVVIWIFTN 2
1 CKSLRTPSNMLVVNLAILDMLMMLKSPVMIINSYNEGPIWGKLGCDVFGLMGSYNGIGSAVNNAAIAYDRHR 2
1 TISRPLDGKLSRKQVTLMIVAIWAWATPFSVMPFLGIWGRYVP 1
2 EGFLTTCTFDYMTEDASTRFFVGSIFVYSYVIPLAMLIFYYSKIVRSVGDHEKTLRDQAKKMNVTSLRSNRDQNEKSAEVRIAKVAIALATLFVFAWTPYAFVALTAAFGNR 2
1 SVLTPLLSTVPACCCKLVSCINPWIYAINHPRYR 2
1 MELQKKMPWFCIHEPVPTNDDSSVGSATTEMSGVSKETSS* 0
 
>UVV2_dapPul 49% penultimate intron lost, last intron has slid back 2 aa
0 MNGWNTPADYKSYVHPHWLSYEEPNPMLHHLLGVLYIFFMIASCLGNGIVIYIFST 2
1 TKELKTPSNILILNLAICDFIMMIKTPIFIVNSFNEGPVFGRLGCSIFGLLGAYVGPCSAVTNAAIAYDRYR 2
1 CISDPMGKRWSKSQASLIVLGCWVYASPVSLLPFTEIVNRFVP 1
2 EGYLTSCTFDYMTDNLETKMFVFILWIWCWIMPLGVIIFSYGKITTQVMTHEARLKEQAKKMNVESLRSGANKDARNEIRVAKVGISLTTLFLLSWTPYFAIAFIGCYGNR SLLTPGLSMIPACTCKMAACVDPFVYAINHPK 2
1 YRLELMKRFPWLCVHEKDDSTRSENSTNATIASEAESRT* 0

>BCRa_dapPul Daphnia pulex NCBI_GNO_149114 53% identical MWS_hemSan, 72% Triops longicaudatus RhA AB293433
0 MSNNLSSGYSSVAYRSEGASVLWGYPPGLSIVDLVPDDMKEFIHPHWNKFPPVNPMWHYL 21 LGVIYVILGITSVT 1
2 GNSLVVHLFAKTRDLRTPANMFVINLAFSDLCMMITQFPMFVFNCFNGGVWLFGPLFCELYACTGSIFGLCSICTMAAISYDRYNVIVNGMNRRRMTY 1
2 GRAGGLILFCWIYAIGWSIPPFVGWGKYIPEGILDSCSFDYLTRDTM 0
0 TISFTCCLFAFDYCVPLIIIIFCYYHIVRAIVHHEDALRDQAKKMNVSSLRSNADQKSQSAEIRVAKIAMMNITLWVAAWTPYAAICLQGAVGNQDKITPLVTILPALIAKSASIFNPVVYAISHPKYRL 0
0 ALQKALPWFCIHEKEEKEPPQDRREDSQSIATTNTNSSDVSLP* 0

>MEL1_dapPul Daphnia pulex NCBI_GNO_366144 no close homologs
0 MTSSNDSAGYLWAINATIWIIDDSNETLGIDWDDWDVSLWTQEQRQLLEHGGIPRQVHVALGVLLSFIVLFGFAANSTILYVFSR 2
1 FKRLRTPANVFIINLTICDFLACCLHPLAVYSAFRGRWSFGQT 1
2 GCNWYGMGVAFFGLNSIVTLSAIACERYIVITSSSCRPVVAKWRITRRQAQK 0V
0 VCAGIWLHCAALVSPPLLFGWSSYLPEGVLVTCSWDYTSRTLSNRLYYFYLLFFGFFLPVSVLTFCYAAIFRFILRSSKEITRLIMTSDGTTSFSKSTVSFRKRRRQTDVRTALI
ILSLAILCFTAWTPYTIVSLIGQFGPVDEDGELKLSPMVTSIPAFLAKTAIVFDPLVYGFSSPQFRNSVRQILRQQSISSSGNAGNRAGPNNMAMARTAIQNSRASSHATVSSF
SRNARMFPKDPLSKKTPNDPFVSTPLAVQQIPHFRLPTDVDINEQQFRRGIYANKSVSYWIDIIVLLQLGENLRKSCMKRKNSFKIPAGSIPQKNKLSNSRCSLLEDVSTHSLA
LRQMIFRKEGELYLFHHQPSHNAELAANKMDHQGNNKRIRRRFSEADMMHRSGKCRKNLPVSTSFDQ* 0

Daphnia opsins have no experimental data but their 'best-blast to PubMed' allows inference from opsins with experimental data:
UVV1_dapPul     NCBI_GNO_416624 ... 43% Acyrthosiphon pisum rhodopsin 7 XM_001944891
BCRH1_dapPul    NCBI_GNO_149114 ... 72% Triops longicaudatus AB293433
MEL1_dapPul     NCBI_GNO_366144 ... 33% Patinopecten yessoensis scop1 Gq AB006454
TMTa_dapPul                     ... 36% Apis mellifera pteropsin
TMTb_dapPul                     ... 36% Apis mellifera pteropsin

This blast twilight zone is especially dangerous for photoreceptor opsins because they are embedded in a much larger gene family of generic rhodopsins and GPCRs which share many structural and signaling properties. A slowly evolving generic rhodopsin might well score higher than fast evolving photoreceptor opsins. Gene expansions are noted for markedly enhanced rates as copies neo- or subfunctionalize. The generic rhodopsin might also share diagnostic residues through convergence, at least at the level of statistical significance ambiguity. Consequently intron location/phase and synteny can provide important backup.
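For orientation on what the percent identities quoted on this page measure, here is a minimal sketch that computes identity over two pre-aligned, equal-length segments. It is not the BLAST calculation (which works on its own local alignment); the function name is hypothetical, and the segments are the second and third exons of the TMTa_dapPul and TMTb_dapPul records on this page, which happen to align without gaps.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Exons 2 and 3 of the two Daphnia ciliary opsin records on this page.
TMTA_EXON2 = "KMTPLNWMLLNLACSDGAIAGFG"
TMTB_EXON2 = "RMTPLNWMLLNLACSDGAIAGFG"
TMTA_EXON3 = "TPISAAAALKFTWPFSHELCVAYAMIMSTA"
TMTB_EXON3 = "TPISTAAALEFGWPFSQELCVAYAMIMSTA"

print(round(percent_identity(TMTA_EXON2, TMTB_EXON2), 1))
print(round(percent_identity(TMTA_EXON3, TMTB_EXON3), 1))
```

Conserved helix segments like these score far above the whole-protein figure, one reason a single short match is weak evidence in the twilight zone.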

The synteny circle surviving at this phylogenetic depth will be local (optimistically Pancrustacean). That is, the blue opsin of Daphnia might be in synteny with Drosophila (ie establish orthology) but not with Platynereis ciliary opsin, much less any vertebrate opsin (eg encephalopsin). This could be remedied to some extent by ancestral gene order reconstruction. The degree to which synteny can contribute to validating orthology relations within opsins is not currently known.

Ciliary opsins for Daphnia, absent from the collection of 25 pipeline-labelled genes, can be located by querying with the Anopheles counterpart. Stored at the Opsin Classifier as TMT_dapPul, these are plausibly orthologs of deuterostome and lophotrochozoan ciliary opsins, as are new ciliary opsins from Culex, Aedes, Tribolium, and Bombyx. Counterparts to this gene and presumably its associated photoreceptor structure are missing in Drosophila, Nasonia, and other genomes.

In Daphnia, with its high level of apparent tandem duplication and 'excess' of opsins, the opsin of each class with highest external blastp score may be the parental gene and best conserve the function observed in its counterpart in other species.

>TMTa_dapPul Daphnia pulex (water_flea) last exon uncertain 45% id TMT1_anoGam
0 MPVWVYWSASAYLLFISIAGLFMNIVVVVIILNDSQ 0
0 KMTPLNWMLLNLACSDGAIAGFG 2
1 TPISAAAALKFTWPFSHELCVAYAMIMSTA 1
2 GIGSITTLTVLALWRCQHVVWCPTNRNSNFTDPNGRLDRRQGALLLTFIWTYTLIVTCPPLFGWGRYDREAAHIS 2
1 CSVNWESKMDNNRSYILYMFAMGLFIPLMAIFVSYISILLFIHK 0
0 SQQTSNNSDTVEKRVTFMVAVMIGAFLTAWTPYSIMALVETFTGDNVTNDSVSSEIKFYAGTISPAVATVPSLFAKTSAVLNPLIYGLLNTQ 0
0 FRTAWEKFSSRFLGRKKRHQRSQMAMGVSHKRRRDYLRTLLNRPASDEPAIVQHPSTKEMASSQAVSCVVVSNLDVPRAPNNSYVTVNDE* 0

>TMTb_dapPul Daphnia pulex (water_flea) ciliary long tail 60% identity
0 MPTWAYRLTAAYLLLISVLGLIMNVVVVIVILNDSQ 0
0 RMTPLNWMLLNLACSDGAIAGFG 2
1 TPISTAAALEFGWPFSQELCVAYAMIMSTA 0
0 GIGSITTLTALAIWRCQLVVCCPAKRKSAFTNHSGRLGCRQGVILLVIIWIYALAITCPPLFGWGRYDREAAHIs 2
1 CSVNWESKTNNNRSYILYMFCMGLVVPLAVIIISYVRILRVVQK 0
0 NQQQSGNVHRHRRDAAEKRVTMMVACMIAAFMAAWTPYSILALFETFIGQDNHSTYYSSRINNATNFSSAFPDGDLSYVGTISPAFATIPSLFAKTSAVLNPLIYGLLNTQ 0
0 FRLAWERFSLRFLGRFQCHRTQGVSGQHGANHHKTRRNVRKYLPNCYGDSRSLKPTPTVHLPMKEMVVSHAEQKVKTAQEQASSSVTKITTIPLISSDNQTIVSCPSSIMAN
CQQHETNQANHQQAARPDKVVDHQHLLQPNRLSSLLSLSLPSVLISTPNLPCSAQRQSAAEDQAMATCQQMTSGRIRDQQQQSDSFVVVGLLSRSADCYHHHTGDVEQFVFLDSTVDELGLTARSASP* 0


Hexapoda: Tribolium castaneum (flour beetle) .. 3 opsins

TriCasEyes.png

The red flour beetle, which is highly dark-adapted in lifestyle, has lost its blue opsin but not its ultraviolet opsin according to both the newly published genome project and specialized experimental querying. It also retains the ancestral long wavelength color vision opsin and a ciliary opsin (called pteropsin in insects though likely a strict ortholog of vertebrate TMT). The 110-page supplement to the Tribolium genome article contains an excellent Table S14 of all known genes involved in insect eye development.

A fellow coleopteran, the corn rootworm Diabrotica, furnishes an ultraviolet opsin.

Insect opsins are expressed non-uniformly across individual eye units (ommatidia) within compound eyes. In Drosophila, six peripheral photoreceptor cells R1-R6 express LW opsin, which detects brightness, and project into the upper optic neuropil (lamina). Central photoreceptors R7 and R8 provide color vision via UV, blue, and LW opsins and project into the second optic neuropil (medulla). The dorsal rim area ommatidia are modified to detect polarized light.

The comparative genomics of ommatidia number and opsin utilization is indicated in the figure. Opsin gene loss raises different issues, namely replacement, from the more familiar gene gain issues (differential rewiring). After discussing various sequential mutational scenarios and the necessity of each step being adaptive or at least near-neutral, Jackowska et al settle upon expansion of LW opsin expression into all photoreceptor cells, resulting in co-expression with blue opsin in some R8 cells and UV opsin in R7 cells. This is followed by loss of expression or pseudogenization of blue opsin. Although co-expression defeats the purpose (via spectral summation) of separate opsins that enable color vision, there are precedents in butterflies and (typically nocturnal) vertebrates.

Opsins tribolium.png

It's also known how Apis and Manduca (also genome project species) end up with nine photoreceptor cells per ommatidium instead of eight -- it's due to duplication of R7 cell fate (across all ommatidia). That raises the interesting question of whether such cell duplication simply results in duplication of opsin expression at the molecular level. That's not quite the case today because the two central R7-like cells exhibit differential opsin expression. It's not known whether additional mutations were needed to attain this.

In summary, insect genomes are fairly straightforward in terms of their contribution to establishing the ancestral arthropod visual system, but their real value lies in the extensive comparative data available within Insecta, ecological studies of adaptive vision, and the experimental genetic opportunities within Drosophila (eg a recent article exploring deviations from ommatidia expressing but a single opsin). However no single insect genome can serve all purposes because of gene loss (eg ciliary opsins in Drosophila).

That's also the case for non-opsin GPCR, which have gained a new importance given the possible paraphyly of the opsin gene tree (ie some opsin gene duplicates may have given up retinal to signal via other agonists). Here we are fortunate to have a genome-wide inventory of neurohormone GPCRs in Tribolium. This turns up 20 biogenic amine GPCR (21 in Drosophila, 19 in bee), 48 neuropeptide GPCR (45 in Drosophila, 35 in honey bee), and 4 protein hormone GPCRs (4 in Drosophila, 2 in bee), with likely ligands for 45 of the 72 Tribolium GPCR. The flour beetle retains an ancestral vasopressin GPCR and cognate peptide, unlike other studied insects, which are not adapted to such an extremely dry environment. On the other hand, Tribolium lacks allatostatin-A, kinin, and corazonin. This covers comparative genomics of 340 million years of insect GPCR evolution -- it is very common for new agonist/receptor couples to arise and old ones to disappear. Again we see genome density sampling will need to be high to sort out Urbilatera.
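The receptor counts quoted above can be tabulated to check the stated Tribolium total and to compare species side by side. This is a small sketch using only the numbers in the paragraph; the dictionary layout is just one convenient arrangement.

```python
# Neurohormone GPCR counts as quoted in the paragraph above.
GPCR_COUNTS = {
    "Tribolium": {"biogenic amine": 20, "neuropeptide": 48, "protein hormone": 4},
    "Drosophila": {"biogenic amine": 21, "neuropeptide": 45, "protein hormone": 4},
    "Apis": {"biogenic amine": 19, "neuropeptide": 35, "protein hormone": 2},
}

# Total per species across the three receptor classes.
totals = {species: sum(counts.values()) for species, counts in GPCR_COUNTS.items()}

# Share of Tribolium receptors with a likely ligand (45 of the total).
tribolium_ligand_fraction = 45 / totals["Tribolium"]

for species, total in totals.items():
    print(species, total)
print(f"Tribolium receptors with likely ligands: {tribolium_ligand_fraction:.1%}")
```

The Tribolium classes sum to the 72 receptors cited, so likely ligands are in hand for a little under two thirds of them.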

>UV5_triCas Tribolium castaneum (flour_beetle) 
0 MYVVHPFKIIRNKVTILRTMETMANHLGWNVPKDELIHIPQHWLVYPEPEASMHFLLALIYIGFFIMATIGNGLVIWIFST 2
1 SKSLRTASNMFVVNLAICDFAMMIKTPIFIYNSFYRGFALGHLGCQIFAFIGSLSGIGAGMTNACIAYDRYT TITRPFDGKITRTKALVMIIFVWGYTIPWAVMPLLEIWGRFAP 1
2 EGFLTACSFDYLTDTFDNHMFVTSIFICSYVIPMSMIIYFYSQIVSKVFSHEKALREQ 0
0 AKKMNVESLRSNQSQQASQSAELRIAKAAIAICSLFVASWTPYAVLALIGAFGDQSLLTPGVTMVPACACKFVACLDPYVYAISHPKYR 2
1 LELQKRLPWLAIKETAASETQSTTTENTTTQSATTTT* 0

>LWS_triCas Tribolium castaneum (red flour beetle) ES544655 3 exons from AAJJ01000967 5 fusion relative to bee
0 MSVMGEPNFIAWAAQRSGYGGGNLTVVDKVLPDMLHLVDAHWYQFPPMNPLWHGILGFVIGVLGFVSIVGNGMVIYIFSSTKALRTPSNLL
VVNLAFSDFLMMlCMSPAMVINCYNETWVLGPLVCELYGMSGSLFGCASIWTMTFIALDRYNVIVKGLSAQPLTKKGAMLRILIIWVFSTLW
TIAPFFGWNRYVPEGNMTACGTDYLTKDWVSRSYILVYAVWVYFVPLFTIIYSYWFIVQ 0
0 AVAAHEKSMREQAKKMNVASLRSSEAAQTSAECKLAKIALMTITLWFFAWTPYLVTNFTGIFEGAKISPLATIWCSLFAKANAVYNPIVYGIS 2
1 HPKYRQALQKKFPSLVCAGEPDDTTSTASGVTNVTTDEKPATA* 0

>TMT_triCas Tribolium castaneum (60%)55 298 encephalopsin-class ciliary
0 MKNFNSTEIGDELLIPVEGYIAAAVVLFCIGFFGFSLNLTVIIFMLKERQ 0
0 LWSPLNIILFNLVVSDFLVSVLGNPWTFFSAINYGWIFGETGCTIYGFIMSLL 1
2 SITSITTLTVLAFERYLLIARPFRNNALNFHSAALSVFSIWLYSLSLTIPPLIGWGEYVHEAANLS 2
1 CSVNWEEKSPNSTSYILYLFAFGLFLPLVIITFSYVNIILTMRR 0
0 NAAFRVGQVSKAENKVAYMIFIMIIAFLTAWSPYAIMALIVQFGDAALVTPGMAVIPALLAKSSICYNPVIYIGLNAQVKGAKWVSGLIYLFQFQQAWMQKWKKNRR
GSDALGTSRVMLETIHQACRDEKTDKLLEKKTKFCKDFETDVSML* 0


Hexapoda: Pediculus humanus (louse) .. 3 opsins

Opsin louse.png

The body louse genome, being favorably small at 108 Mbp, is well along with 2.2 million traces and a contig assembly hopefully disentangled from its endosymbiont bacterium. Sequencing is medically motivated. The lifestyle of this hemimetabolous (nymph-like adult, no pupal stage) insect does not suggest a full spectrum of metazoan photoreceptors; indeed we shall find but 3 opsins. Even that seems a lot for a single lateral ocellus of 130 rhabdomeric photoreceptor cells lacking Semper and dedicated pigment cells. The broader interest here is intronation and synteny of these opsins (hence orthology), not available in many insects with opsin studies. It requires quite dense sampling to get ancestral introns for each arthropod opsin class because high rates of intron gain and loss can occur.

I reconstructed 3 multi-exon louse opsin genes on 24 Dec 07 by tblastn of numerous queries against GenBank wgs database division. These apparent rhabdomeric imaging opsins are stored in the Opsin Classifier as INSE_LWS_pedHum, INSE_UVV1_pedHum, and INSE_UVV2_pedHum. Louse otherwise seems a gene loss story in terms of relic ciliary opsins or even melanopsins so not especially favorable for retention of ancestral characters. The new opsins potentially provide trichromatic color vision to the louse in the short, blue, and long wavelength photoreception regimes, though lambda max awaits experimentation as the second ultraviolet opsin could be either re-tuned or co-opted for some other function, as in bumblebee where a UV opsin is expressed in proximal lamina rim, antennal lobe, central complex and protocerebrum clusters. That seems likely because INSE_UVV2_pedHum is back to ancestral tyrosine in (bovine rhodopsin) position E113 whereas true ultraviolet insect opsins all specify phenylalanine here (which relaxes lambda max into the ultraviolet, ie closer to that of free retinal).

CA Hill of the louse genome annotation team discussed 3 opsins back in a June 2007 email session, calling PHUM001073 perhaps an ultraviolet opsin while rejecting a fourth PHUM000074. These gene models are not released to GenBank nor is that terminology used in the meagre search capabilities of P. humanus VectorBase. Upon whole proteome file download, PHUM001073-RA turns out to be an unintronated dna fragment matching residue 44 to stop codon of INSE_UVV1_pedHum. PHUM000074-RA has nothing to do with opsins. PHUM005795-RA is missing the first 49 residues of INSE_LWS_pedHum but otherwise identical. PHUM001044-RA is a fragment beginning at residue 55 of INSE_UVV2_pedHum. In short, it's hard to find full length genes without benefit of the Opsin Classifier, cdna, or ab initio gene predictor.

Hexapoda: Rhodnius prolixus (kissing_bug) .. 4 opsins

Yet another genome project completed long ago at the trace level but sitting around unassembled until 17 June 2009 (tblastn now at GenBank wgs). In August 2008 some 6,879,098 trace reads and 16,284 EST sequences were available. This number of traces is more than adequate for a good assembly but until now, opsins had to be fished out exon by exon using blastn of trace archives.

Rhodnius prolixus is a large blood-sucking hemipteran insect that carries a parasitic protozoan (Trypanosoma cruzi) responsible for Chagas disease, transmitted through bites around the eyes and mouth. Chagas disease is a currently incurable tropical disease that damages the heart and nervous system. Rhodnius is nocturnal, becoming active only at night, with possible implications for its opsin repertoire. It is found in South and Central America, primarily in domesticated rural areas; the disease currently affects 16-18 million people and kills around 20,000 annually. Darwin is sometimes claimed to have suffered from Chagas disease as a result of a bite (implausibly in northern Argentina) reported in Voyage of the Beagle diaries.

Rhodnius clearly has three distinct melanopsins and a ciliary pteropsin. One is a long wavelength sensitive gene most closely related (84% identity) to that of Tribolium but whose intronation pattern is closest to Apis (a phase 00 intron is missing in Rhodnius). The other two Rhodnius melanopsins have K90 and so presumably absorb in the UV. The ciliary opsin is closest to that of mosquito and flour beetle but quite diverged at 56% identity.

>UV7_rhoPro Rhodnius prolixus (kissing_bug) Pterygota K90 at KMP, ortholog RH7 of droMel
0 mKYFHLYPIEQWKMHRFFTEEYLKLVNTHWFEYPPPNKQIHYIFAAVYFLVMLVGVSGNLLVIFMILR 2 
1 FRTLRTSSNILILNLAVSDFLMVAKMPVFIYNSFYFGPVLGEM 1
2 GCHFYGFIGGLSGTASILTLAAIAMDRYLGIAHPLNFNQGRAKKRTIVWITFIWVYSITFASIPLSHIGVKTYVPEGFLTSCSFDYLSTDIQNRCFIFIYFVAAWCLPLLVIITSYVGICREVLRVSLIRKGQE
REQRKREAKLSAILALATFLWFLSWTPYAAVALLGIFGYKNHITQLASMIPALFCKTAACVNPFIYGLNHPRLRQQLLKLCCKKRYNLEKTHFSRSWRNTSCSFKLKEQSLCNVSQSRLRRTSTVASEPSEHSTHFM* 0

>UV5_rhoPro Rhodnius prolixus (kissing_bug) exon 1 missing, K90 at KTP
0 0
1 ASTSGNIRTLGWNLSPEDLKHIPEHWLSYPEPEPILNYALGVLYIFFMLIALIGNGLVIWIFST 2
1 AKTLRTPSNIFVVNLAICDFLMMSKTPIFIYNSFKLGYALGHRACQIFALLGSFSGIGASATNAVIAYDRYR 2
1 VIATPFAPKLSRTKAVLYLALVWAYVTPWALLPLFEQWSRFVP 1
2 EGFLTSCTFDYLTPTSEIRNFVTVMFFICYVFPMSLIIYFYSQIVSHVIIHEHNLREQ 0
0 AKKMNVESLRSNANMHTQSAEIRIAKAAITICFLFVASWTPYAVLALIGAYGNQ 2
1 DLLTPAVTMIPACACKAVACVDPYVYAISHPRYR 2
1 QELSKKFPWLDIKEAPAPSSVDANSTATEMTLPTQTSPAEA* 0

>LWS_rhoPro Rhodnius prolixus (kissing_bug) 
0 MAQPIGPSFAAYQWGQSANPSANRSVVDMVPPEMLSMVDAHW 2
1 YQFPPLNPLWHGILGFVIGVLGIISIVGNGMVIFIFSSTKTLRTPSNLLVVNLAFSDFLMMFTMSPPMVINCYNETWVL 1
2 GPLMCELYGMLGSLFGCASIWTMTMIALDRYNVIVK 0
0 GISAKPMTNKTAMLRILLVWAFSIMWTVFPFFGWNR 2
1 YVPEGNMTACGTDYLTKNWVSRSYILVYSVFVYFLPLFTIIYSYFFILQ 0
0 AVSAHEKQMREQAKKMNVASLRSAEAANTSAEAKLAK VALMTISLWFMAWTPYLVINYSGIFETISISPLFTIWGSLFAKANAVYNPIVYAIR 2
1 HPKYKQALEKKFPSLSCASPQDDTTSVATGVTTSTDDKAPSA* 0

>TMT_rhoPro Rhodnius prolixus (kissing_bug) Insecta; Pterygota ciliary opsin full ACPB01038514 + ACPB01038515 56% TMT_triCas
0 MLMPSAGFLAASIILFLIGFLGFFGNLIVIIIMCRDKN 0
0 LWTPVNFILFNVIVSDFSVAALGNPFTLASAIAKRWFFGQSMCVAYGFFMALL 1
2 GITSINSLTVLALERYLIVSQPVSHGSLSRPTASDIVGSIWLYSFVITiPPLVGWGEYGLEAANIS 2
1 CSINWETRSHSSTSYILFLFTFGFFIPIIVISYSYMNIILTMKK 0
0 STMNAGRVNKAESRVTWMIFVMIFAFFLAWTPYAILALMIAFFDSNVSPAIATIPAIFAKTSICYNPFIYAGLNTQVVYFFV* 0 

Hexapoda: Acyrthosiphon pisum (pea_aphid) .. 6 opsins

The first draft of aphid genome Acyr_1.0 was released in June 2008 though no publication has yet appeared. The contigs are now available at GenBank in wgs. Coding gene annotation is low quality, with 11 gene models labelled 'opsins' of which only 6 are valid.

The opsin repertoire of Acyrthosiphon is surprising. First, it does not reflect any gene loss because ciliary, long wavelength, blue and ultraviolet opsins are all represented. The latter classes of opsins are expanded into two gene pairs. Contigs are so small that it is not possible to say whether these are tandem. One gene of the four has lost K90 to valine and presumably lacks the associated shift to UV in peak absorption.
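Reading off the residue at the K90 position can be sketched with a pattern match on the surrounding helix 2 context. The regular expression below is a heuristic assumed for illustration only: it was derived from the handful of records on this page (a conserved NLA..D stretch, then the tuning residue followed two positions later by proline) and is not a validated motif. The function name is hypothetical.

```python
import re

# Heuristic context (an assumption): N-L-A, two residues, D, five residues,
# then the K90-equivalent residue, one residue, and a proline.
K90_CONTEXT = re.compile(r"NLA[A-Z]{2}D[A-Z]{5}([A-Z])[A-Z]P")

def k90_residue(protein):
    """Return the residue at the K90-equivalent position, or None if no match."""
    m = K90_CONTEXT.search(protein.upper())
    return m.group(1) if m else None

# Helix 2 segments copied from records on this page.
TM2_SEGMENTS = {
    "INSE_UVVa_acyPis": "SKPLRTPSNLFVLNLALCDFSMVLVLPILIYDSIDHKYP",
    "INSE_UVVb_acyPis": "AKPLRTPSNIFVINLALCDFVMMAKAPIFILGSINRGYQ",
    "UV7_rhoPro": "FRTLRTSSNILILNLAVSDFLMVAKMPVFIYNSFYFGPVLGEM",
    "UV7_ixoSca": "RRRIRSQANLLVFNLALSDLLMVLEIPLLVYNSLKLRPALGVW",
}

for name, segment in TM2_SEGMENTS.items():
    print(name, k90_residue(segment))
```

On these segments the sketch recovers valine for the aphid gene that has lost K90, lysine for its paralog and for the Rhodnius RH7 ortholog, and glutamate for the Ixodes fragment, in line with the notes in the fasta headers. Sequences outside this small set would need a proper alignment rather than a fixed pattern.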

The first pair has 8 exons, the second 3, suggesting (along with lowish percent identity) substantial time since duplication and divergence. The second pair, called UVV2a/b below, has lost the HEK motif of the third cytoplasmic loop, raising issues about retention of Gq as signalling partner.

Five lines of evidence suggest this second pair corresponds to RH7 in Drosophila:

  • RH7 are best-blastp match at nr and wgs to aphid query, though percent identity is low at 43%
  • large deletion in CL3 causes loss of HEK, though residual residues do not align with CL3 of drosophila
  • distinctive match in EL2 of ALDIGLSV region of RH7 to VLDLGYS in aphid including 1 extra residue
  • distinctive length and similar motif past DRY motif at boundary of TM4 and CL2
  • shares unique 3 exon structure and identical intron location and phases (21 12)

RH7acyPis.jpg

Odd phylogenetic distribution of RH7 within insects:
+ Insecta Dicondylia Pterygota Neoptera Paraneoptera  Hemiptera   Acyrthosiphon
- Insecta Dicondylia Pterygota Neoptera Paraneoptera  Hemiptera   Rhodnius

+ Insecta Dicondylia Pterygota Neoptera Endopterygota Diptera     Drosophila
- Insecta Dicondylia Pterygota Neoptera Endopterygota Diptera     Aedes
- Insecta Dicondylia Pterygota Neoptera Endopterygota Hymenoptera Apis
- Insecta Dicondylia Pterygota Neoptera Endopterygota Hymenoptera Nasonia
- Insecta Dicondylia Pterygota Neoptera Endopterygota Coleoptera  Tribolium 

>TMT_acyPis Acyrthosiphon pisum (pea_aphid) XM_001952259 ciliary opsin 53% TMT_aedAeg
0 MDEETSKGVLT 0
0 LWTPQNVIIFNLATSDLAVSVLGNPVTLAAAITKGWIFGQTICVIYGFFMALF 1
2 GIASITTLTVLAYDRYLMIRYPFSSSRLTKETALYAIAGIWIYAFAVTGPPLFGWNRYVNESANIS 2
1 CSIDWESGEHSNYVIYIFVFGLFLPVTVIIYSYVSLVVTVRK 0
0 RAAEKIIGQATKAECRVAIMVAVMILAFLTAWMPYSVLALMIAFGGVHISPVVSIIPALCAKSSICWNPIIYIGLNTQ 0
0 FRSAWKRFLNIQDTLSEVSLDADITTGMTKLMTGHQELPAHPMNNGDASHPPGLIMCCLAHDEHRQSATYADRYECNLEMKSCNPQTLGRRPETDIGDVSL* 0 

>INSE_LWS_acyPis Acyrthosiphon pisum (pea_aphid) SCAFFOLD6053:23617,25535 67% LWS_pedHum
0 MLNKIGSHYERQENWVAEGGFGNETVVDRVPADMMHLIDPSW 2
1 YQFPPMESMWYKWLGVTIFFLGILSVVGNGMVIYIFTCTKNLRTPSNLLIVNLAFSDFCLMFTMCPAMVWNCFYETWMF 1
2 GPFACELYAMFGSLFGVTSIWTMVFIALDRYNVIVK 0
0 GLSAKPMTTKLALLQIFCIYLHGLFWTLTPFFGWSR 2
1 YVPEANMTACGTDYLTLAWHSRSYVLVYAIFAYYLPLLVIIYAYYFIVK 0
0 AVASHEKSMREQAKKMNVSSLRSGDQSNTSAEFKLAKVALMTISLWFMAWTPYMVINFAGIFQLMTIDPLFTIWGSVFAKANAVYNPIVYAIS 2
1 HPKYRLALDKKFPCLVCGKLEDDRSDSKSVASAQTTISEDKV* 0

>INSE_UVVa_acyPis Acyrthosiphon pisum (pea_aphid) 8 exons SCAFFOLD14509:21417-33525 62% UVV_apiMel V in K90
0 MDFNRSVSRPLSQLGS 2
1 SFMENEEELQLMGWNLTPEDLTHIPEHWLSYPEVRSLYHYILAFSYTILFCLGVIGNGLVLWIFCV 0
0 SKPLRTPSNLFVLNLALCDFSMVLVLPILIYDSIDHKYP GHLQCQIFALCGSISGIGAGATNAAIAYDRYS 2
1 TIAKPFEGRMTYGKALILIICIWIYVLPWCLLPLTEKWNRFVP 1
2 EGFLTSCSFDYLTPTEETKAFVGTMFVICYVIPMSFIIYFYSQIVCHVFNHEKALREQ 0
2 AKKMNVESLRSNQDANAQSAEVRIAKAAITICFLFVAAWTPYAVVAMIGAFGDQ 2
1 SLLTPIASMLPAVFAKTVACFDPYVYAISHPKYR 2
1 LELSKRVPCLGITEKPLATSDTQSITTAA* 0

>INSE_UVVb_acyPis Acyrthosiphon pisum (pea_aphid) 8 exons SCAFFOLD14509:41790,53815 76% identical UVVa_acyPis K in K90
0 MDFNRTVSRPLAQLGs 2
1 SLMENEVGETHLLGWNLQAEDLIHIPEHWLKYQEPSSLQHYYLAFMYTIFMFVALFGNGLVIWVFCV 0
0 AKPLRTPSNIFVINLALCDFVMMAKAPIFILGSINRGYQ GHFLCQLFGTAGAFSGIGASATNAAIAYDRFS 2
1 TIAKPFDGRMTYGRAFFLIICIWTYTLPWGLLPLTEKWNRYVP 1
2 EGYLTSCTFDYLSPTDETRAFVGIMFVICYVIPVSLVIFFYSQIVSHVFNHEKALREQ 0
0 AKKMNVESLRSNQDANAQSAEVRIAKAAITICCLFIASWTPYAVVAMIGAFGDR 2
1 SLLTPGITMIPAIFCKTVACFDPYVYAISHPRYR 2
1 LELSKRVPCLGISEKPPPTASETQSTTTAA* 0
  
>INSE_UVV2a_acyPis Acyrthosiphon pisum (pea_aphid) 3 exons SCAFFOLD4798:3246-5335 altered HEK CL3 52% UVV2_pedHum K in K90
0 MIDFKTKYPVNLWKDHGLYTDDYIKLINSHWLKFMPPNPTSHYVLGLLYTVIMVFGCTGNSLVIFMYFK 2
1 CRSLQTPANMLIINLAVSDFIMLAKASVFIYNSYYLGPALGKL 1
2 GCQVCGFLGGLTGTVSIMTLAAISLDRYYVIVCPLKAAVKTTKQRARIWIGLIWIYGFSFSIVPVLDLGYSRYVSEGYLTSCSFDYLSDNDQDKRFI
LVFFTAAWCIPFTIILYCYVNILMAVWMTTEIVTSRVGQQEEKRKTDIRLGYMVIGALALWFVSWTPYAVVALLGVFDLKEYISPLSSMIPALFCKAAS
CTDPWFYAITHPRFKKELMKLLTKSKSRKLVRNYGMKKGWVGSHLNKNGSVDFDNCLKTEYKEENTTIFMLESDDNNLHCQGSTSGHKTESTKEPETKFTASASQETLKYMLPS* 0

>INSE_UVV2b_acyPis Acyrthosiphon pisum (pea_aphid) SCAFFOLD14504:180756-183351 72% UVV2a_acyPis altered HEK CL3 K in K90
0 MSDFKTKYPIDTWKEHGFYTDDYMKLINSHWFKFMPPNATSHYILGFLYSVIMVLGCFGNSLVIFMYIK 2
1 CKSLQTPANVLIMNLAVSDFIMLAKTPVFIYNSFYQGPTLGKL 1
2 GCQIYGFFGGLTGTVSIMTLAAISLDRYYVIVHPLNAAVKTTKQRARVWIGLIWIYGFLFSIIPVMDLGYNRYVPEGYLTSCSFDYLSDDNQEKGFILVFFTAAWCIPFTTISYCYIKI
LRAVWMTSEMAASRFGQEEEKRKTEIRLGYVVVGVIMLWFVSWTPYAMVALLGVFDRKDYITPLSSMIPAVLCKAASCMDPWIYAITHPRFKNELTKLMSRKKTRKLERDYGMKKNWGGQ
SYSNKSGAGLRNLSSSEDECVEEVIVVIDPDDKKMKRQGSTSSHKTEETKALETKFPPTRQESLKYMPPSWYKLPRTTSKSSIMLDPKLTGDDNNK* 0 

Hexapoda: Drosophila melanogaster (fruitfly) .. 7 opsins

DroPhylo.jpg

Every aspect of photoreception in Drosophila has been studied for decades. Because this research is regularly reviewed at length, the focus here is on genome project developments and issues that remain in characterizing opsin function and evolution.

Drosophila has seven opsins, all of melanopsin class. Ciliary-class opsins (present elsewhere in arthropods) have been lost in all 12 drosophilid genomes, as have the peropsin classes (which persist in deuterostomes and some lophotrochozoa, perhaps because without ciliary opsins there is no need for a retinal isomerase regeneration cycle). However this raises the question of whether some neuroanatomical structures have also been lost. The comparable ciliary opsin in bee is expressed somewhere in the brain but not in simple or compound eyes -- unfortunately it is not known whether anatomical expression is like that of Platynereis nor whether Drosophila lacks this structure.

The paired Drosophila retinas have 850 ommatidia, each housing eight photoreceptors of three types; the paired cephalopharyngeal Bolwig organs have 12 photoreceptor cells of two types. Oddly, during metamorphosis to eyelet, the outer Bolwig cells die while inner cells switch gene expression from RH5 to RH6.

Two of the Drosophila opsins have peak sensitivity in the ultraviolet (RH5 RH7) consistent with their K90 lysine and shorter CL3 loop motif, two sister opsins (RH3 RH4) peak in the blue and the rest (RH6, (RH1, RH2)) at longer visible wavelengths. Opsins have been assigned to the four known photoreceptor structures as follows:

  • RH5 RH6 Bolwig organ (larva) in founder and peripheral cells, resp.
  • RH6 Hofbauer-Buchner eyelet (adult founder cell remnants of Bolwig organ)
  • RH2 ocellus (adult)
  • RH1 R1-R6 peripheral photoreceptors of ommatidia (adult eye)
  • RH3 RH4 R7 photoreceptor of ommatidia (adult eye)
  • RH5 RH6 R8 photoreceptor of ommatidia (adult eye)
  • RH3 dorsal R7 R8 polarization receptors (adult eye)

Note RH7 is missing from the list. This orphan opsin has no tissue-labelled transcripts at GenBank as of June 2009. It does not occur in any of the known photoreceptors, suggesting the repertoire of adult brain ultrastructures is still incomplete. Some authors have questioned whether RH7 is a 'real' opsin (because the third cytoplasmic loop CL3 is non-standard).

However, it retains the DRY motif, the Schiff base lysine and many other characteristic opsin residues and motifs. Its peak sensitivity would lie in the UV because of the K90 lysine, which is conserved in all 12 drosophilid genomes. The upstream PAX6 promoter RCSI site still matches the consensus sequence TAATYCGATTA, even though the first coding exon is anomalously lengthened and very prone to internal indels.
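Checking an upstream region against the RCSI consensus can be sketched as follows. This is a minimal illustration, assuming the standard IUPAC reading of Y as C or T; the helper names (`consensus_to_regex`, `find_sites`) and the `upstream` string are made up for the example and are not taken from any genome assembly. Only the given strand is scanned.

```python
import re
from typing import List

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def consensus_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus string into a regex source string."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(consensus: str, dna: str) -> List[int]:
    """Return 0-based start offsets of every match on the given strand."""
    # The lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [m.start() for m in pattern.finditer(dna.upper())]

RCSI = "TAATYCGATTA"                          # consensus quoted in the text
upstream = "ggcTAATCCGATTAcgtTAATTCGATTAaa"   # hypothetical test sequence
print(find_sites(RCSI, upstream))
```

A reverse-complement scan would be needed to cover the opposite strand; that step is left out here for brevity.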

DroRhos.jpg

RH7 has three exons versus five in bee UV and eight in bee blue opsins. The first intron in RH7 (VIFMYFK 21 CRSLQTP) is identical in location and reading phase 21 to an intron in conventional UV opsins. This provides strong support, independent of Blast clustering, for a shared common ancestry of these opsin classes, because a 300-residue protein has 3 possible phases at each position (thus 900 possible intron sites). This common intron also suggests a tandem or segmental duplication history relating these three genes followed by intron loss, rather than retropositioning followed by intron gain.
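The arithmetic behind the 900 figure, and a rough idea of how unlikely a chance coincidence would be, can be sketched as below. The uniform-placement model is an assumption made only for illustration (real intron positions are not uniformly distributed), and the function names are hypothetical.

```python
def possible_intron_sites(n_residues: int, phases: int = 3) -> int:
    """Number of distinct (codon position, phase) combinations in a protein."""
    return n_residues * phases

def chance_of_coincidence(n_residues: int, n_other_introns: int = 1) -> float:
    """Probability that one given intron is matched in position and phase by
    at least one of n_other_introns placed independently and uniformly
    (a simplifying assumption, not a model of real intron evolution)."""
    sites = possible_intron_sites(n_residues)
    return 1.0 - (1.0 - 1.0 / sites) ** n_other_introns

print(possible_intron_sites(300))        # 300 residues x 3 phases
print(chance_of_coincidence(300, 1))     # single intron in the other gene
print(chance_of_coincidence(300, 4))     # e.g. four introns in the other gene
```

Under this crude model a single shared site has roughly a one in 900 chance of arising independently, which is why a shared position and phase is treated as evidence of common ancestry.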

The intronation of RH7 within arthropods has been stable back to chelicerates (though the gene itself has been lost in many lineages and Drosophila itself has retained only the second). Astonishingly, lophotrochozoan melanopsins also have the identical intron pattern of RH7 (determinable from the Lottia, Aplysia, Helobdella, Schmidtea and Schistosoma genome projects), as do vertebrate melanopsins (for example Gallus), proving both introns of RH7 ancestral to the Ur-bilaterian. None of these latter opsins has ultraviolet K90; indeed some are non-imaging. The only known cnidarian melanopsin, from coral, is a transcript.

RH7 VIFMYFK 21 CRSLQTP Acyrthosiphon
UV5 VIWIFCA 21 AKSLRTP Apis
UVB VIWIFST 21 SKSLRTP Apis
MEL VIYTFSR 21 TKSLRTA Lottia
MEL VIYAFCR 21 SRTLQKP Gallus
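Flanking-residue lines like those above can be derived from the phase-annotated exon records used on this page. The sketch below assumes the convention seen in those records: digit tokens are phase markers, letter tokens are sequence, wrapped lines without markers continue the same exon, and `*` marks the stop. The function names are hypothetical, and the demo input is a short excerpt of the exon ends from the UVV2b_acyPis record, not a full gene.

```python
from typing import List, Tuple

def parse_exon_record(body: str) -> Tuple[str, List[Tuple[int, str]]]:
    """Return (protein, introns) from phase-annotated exon lines.

    Each intron is reported as (residues written before it, phase label),
    where the label joins the end phase of one exon with the start phase
    of the next, e.g. "21" or "00".
    """
    chunks: List[str] = []
    introns: List[Tuple[int, str]] = []
    pending = ""  # phase digits seen since the last sequence chunk
    for token in body.split():
        if token.isdigit():
            pending += token
            continue
        if pending and chunks:  # phase digits sandwiched between two chunks
            introns.append((sum(len(c) for c in chunks), pending))
        pending = ""
        chunks.append(token.rstrip("*"))  # drop the stop-codon marker
    return "".join(chunks), introns

def flanks(protein: str, introns: List[Tuple[int, str]], width: int = 7) -> List[str]:
    """Format each intron as 'upstream phase downstream' flanking residues."""
    return [
        f"{protein[max(0, pos - width):pos]} {phase} {protein[pos:pos + width]}"
        for pos, phase in introns
    ]

excerpt = """0 NSLVIFMYIK 2
1 CKSLQTPANV 1"""
protein, introns = parse_exon_record(excerpt)
print(flanks(protein, introns))
```

Residue counts here follow the residues as written in each exon line; how a codon split by a phase 1 or 2 intron is assigned to one side is a property of the records, not something the sketch decides.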

Other arthropod melanopsins also have unusual cytoplasmic third loops, which has predictive implications for the Galpha signalling partner. This Galpha web tool allows studying the effects of replacing cytoplasmic loops or tail of RH7 with those of its nearest match, the UV-tuned RH5. This would not affect transmembrane structure or extracellular loops but might alter coevolved relations on the cytoplasmic face.

RH7 is exceedingly conserved (except in the amino terminus) in the other 11 drosophilids with sequenced genomes, ruling out both processed and unprocessed pseudogenes. Its two introns bear no relation in position and phase to those in any other Drosophila opsins. The carboxy terminus is surprisingly conserved despite earlier indels. Remarkably for such a conserved gene, it is quite isolated phylogenetically. Only aphid provides a potential ortholog candidate. This cannot plausibly reflect horizontal gene transfer (from what animal?) but cannot reflect an ancient gene duplication either, short of invoking many lineage-specific gene losses.
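The degree of conservation can be quantified exon by exon wherever two orthologs have equal-length exons, so that no alignment step is needed. The sketch below is an illustration only: the function names are hypothetical, and the two strings are a 20-residue excerpt of the second RH7 exon from the droMel and droSim records on this page rather than the whole exon.

```python
from typing import List, Tuple

def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def mismatches(a: str, b: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, residue in a, residue in b) for each difference."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Excerpt of RH7 exon 2 (positions are relative to the excerpt, not the gene).
dro_mel = "PPKSSTFLIMAALYCLISVV"
dro_sim = "PPKSSTFLVMAALYCLISVV"
print(percent_identity(dro_mel, dro_sim))
print(mismatches(dro_mel, dro_sim))
```

Exons that differ in length between species, such as the indel-prone first exon, would need a proper alignment before a comparison of this kind.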

Consequently the first order of business is to work up the species tree with targeted sequencing to pinpoint the evolutionary origin of RH7 -- Insecta; Dicondylia; Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; Eremoneura; Cyclorrhapha; Schizophora; Acalyptratae; Ephydroidea; Drosophilidae; Drosophila melanogaster group. Because the gene is missing in dipteran, hymenopteran and coleopteran genomes (mosquitoes, bee, flour beetle), that search can be restricted. It seems too diverged from other opsins to have originated just in a few tens of millions of years of evolution represented by drosophilids (but perhaps not too much in terms of generations).

Second, it should be noted that a 2002 whole-proteome quantitative transcription project did in fact uncover RH7 transcripts (as displayed at the UCSC GeneSorter). Here peak expression, as normalized to egg-to-adult total RH7 transcripts, occurred in 76-hour mesomorphs. Total expression was highest in 5-day adult females. Improved all-gene experiments in 2008-09 ruled out RH7 expression in pupae but verified expression in adult male and female heads at equal levels. These transcripts are not yet correlated with any anatomical structure. Despite arrays of the full set of 13,000 coding genes, a Drosophila brain expression atlas has never gotten off the ground; each gene must be inefficiently studied in a one-off manner.

Gene, annotation name, coding exons, chromosome location, notes on introns:
 RH1 (CG4550-RA)   5  chr3R 15,712,948 shares two introns with RH6 and one with RH2, similar SKA* termini
 RH2 (CG16740-RA)  4  chr3R 14,725,942
 RH6 (CG5192-RB)   3  chr3R 11,309,650 
 RH3 (CG10888-RA)  1  chr3R 15,907,472 possible retrogene of RH5
 RH4 (CG9668-RA)   2  chr3L 16,850,872 possible retrogene of RH5 with later intercalated genes
 RH5 (CG5279-RA)   3  chr2L 12,009,111 two ancestral introns (also Apis, Daphnia; first also Aplysia, Platynereis and Homo)
 RH7 (CG5638-RA)   3  chr3L 12,162,941 two novel introns, anomalous first exon

RH7 appears not involved in Drosophila circadian photoreception systems, which are mediated by the blue sensitive pterin-flavoprotein cryptochrome CRY (not homologous to opsins) in clock neurons and by opsins RH1, RH5 and RH6 in photoreceptors.

Curiously, CRY is also implicated in magnetic field perception based on anisotropic hyperfine coupling between unpaired electron and nuclear spins (1, 2, 3). RH7 is not plausibly involved in this either because it is new whereas magnetosensing is old and widespread. In eerie analogy to ciliary opsins, drosophilids but not butterflies have lost the close paralog to mammalian CRY.

>RH1_droMel Drosophila melanogaster (fruitfly) CG4550-RA
0 ME 00 SFAVAAAQLGPHFAPLSNGSVVDKVTPDMAHLISPYWNQFPAMDPIWAKILTAYMIMIGMISWCGNGVVIYIFATTKSLRTPANLLVINLAISDFGIMITNTPMMGINLYFETWVLGPMMCDIYAGLGSAFGCSSIWSMCMISLDRYQVIVKGMAGRPMTIPLALGKIAYIWFMSSIWCLAPAFGWSR 2
1 YVPEGNLTSCGIDYLERDWNPRSYLIFYSIFVYYIPLFLICYSYWFIIA 0
0 AVSAHEKAMREQAKKMNVKSLRSSEDAEKSAEGKLAKVALVTITLWFMAWTPYLVINCMGLFKFEGLTPLNTIWGACFAKSAACYNPIVYGIS 2
1 HPKYRLALKEKCPCCVFGKVDDGKSSDAQSQATASEAESKA* 0

>RH6_droMel Drosophila melanogaster (fruitfly) CG5192-RB gross genomic misassembly exon1
0 MASLHPPSFAYMRDGRNLSLAESVPAEIMHMVDPYWYQWPPLEPMWFGIIGFVIAILGTMSLAGNFIVMYIFTSSKGLRTPSNMFVVNLAFSDFMMMFTMFPPVVLNGFYGTWIMGPFLCELYGMFGSLFGCVSIWSMTLIAYDRYCVIVKGMARKPLTATAAVLRLMVVWTICGAWALM
PLFGWNRYVPEGNMTACGTDYFAKDWWNRSYIIVYSLWVYLTPLLTIIFSYWHIMK 0
0 AVAAHEKAMREQAKKMNVASLRNSEADKSKAIEIKLAKVALTTISLWFFAWTPYTIINYAGIFESMHLSPLSTICGSVFAKANAVCNPIVYGLS 2
1 HPKYKQVLREKMPCLACGKDDLTSDSRTQATAEISESQA* 0

>RH2_droMel Drosophila melanogaster (fruitfly) CG16740-RA
0 MERSHLPETPFDLAHSGPRFQAQSSGNGSVLDN 0
0 VLPDMAHLVNPYWSRFAPMDPMMSKILGLFTLAIMIISCCGNGVVVYIFGGTKSLRTPANLLVLNLAFSDFCMMASQSPVMIINFYYETWVLGPLWCDIYAGCGSLFGCVSIWSMCMIAFDRYNVIVKGINGTPMTIKTSIMKILFIWMMA
VFWTVMPLIGWSAYVPEGNLTACSIDYMTRMWNPRSYLITYSLFVYYTPLFLICYSYWFIIAAVAAHEKAMREQAKKMNVKSLRSSEDCDKSAEGKLAKVALTTISLWFMAWTPYLVICYFGLFKIDGLTPLTTIWGATFAKTSAVYNPIVYGIS 2
1 HPKYRIVLKEK 00 CPMCVFGNTDEPKPDAPASDTETTSEADSKA* 0

>RH3_droMel Drosophila melanogaster (fruitfly) CG10888-RA single exon
0 MESGNVSSSLFGNVSTALRPEARLSAETRLLGWNVPPEELRHIPEHWLTYPEPPESMNYLLGTLYIFFTLMSMLGNGLVIWVFSAAKSLRTPSNILVINLAFCDFMMMVKTPIFIYNSFH
QGYALGHLGCQIFGIIGSYTGIAAGATNAFIAYDRFNVITRPMEGKMTHGKAIAMIIFIYMYATPWVVACYTETWGRFVPEGYLTSCTFDYLTDNFDTRLFVACIFFFSFVCPTTMITYY
YSQIVGHVFSHEKALRDQAKKMNVESLRSNVDKNKETAEIRIAKAAITICFLFFCSWTPYGVMSLIGAFGDKTLLTPGATMIPACACKMVACIDPFVYAISHPRYRMELQKRCPWLALNEKAPESSAVASTSTTQEPQQTTAA* 0

>RH4_droMel Drosophila melanogaster (fruitfly) CG9668-RA two exons w large intron (no RM but intercalated genes)
0 MEPLCNASEPPLRPEARSSGNGDLQFLGWNVPPDQIQYIPEHWLTQLEPPASMHYMLGVFYIFLFCASTVGNGMVIWIFST
SKSLRTPSNMFVLNLAVFDLIMCLKAPIFIYNSFHRGFALGNTWCQIFASIGSYSGIGAGMTNAAIGYDRYNVITKPMNRNMTFTKAVIMNIIIWLYCTPWVVLPLTQFWDRFVP 1
2 EGYLTSCSFDYLSDNFDTRLFVGTIFFFSFVCPTLMILYYYSQIVGHVFSHEKALREQAKKMNVESLRSNVDKSKETAEIRIAKAAITICFLFFVSWTPYGVMSLI
GAFGDKSLLTPGATMIPACTCKLVACIDPFVYAISHPRYRLELQKRCPWLGVNEKSGEISSAQSTTTQEQQQTTAA* 0

>RH5_droMel Drosophila melanogaster (fruitfly) CG5279-RA two small introns also seen in Apis, Daphnia; first in Aplysia, Platynereis and Homo
0 MHINGPSGPQAYVNDSLGDGSVFPMGHGYPAEYQHMVHAHWRGFREAPIYYHAGFYIAFIVLMLSSIFGNGLVIWIFST 2
1 SKSLRTPSNLLILNLAIFDLFMCTNMPHYLINATVGYIVGGDLGCDIYALNGGISGMGASITNAFIAFDRYKTISNPIDGRLSYGQIVLLILFTWLWATPFSVLPLFQIWGRYQP 1
2 EGFLTTCSFDYLTNTDENRLFVRTIFVWSYVIPMTMILVSYYKLFTHVRVHEKMLAEQAKKMNVKSLSANANADNMSVELRIAKAALIIYMLFILAWTPYSVVALI
GCFGEQQLITPFVSMLPCLACKSVSCLDPWVYATSHPKYRLELERRLPWLGIREKHATSGTSGGQESVASVSGDTLALSVQN*

>RH7_droMel Drosophila melanogaster (fruitfly) CG5638-RA long N-terminal has M comp genomics support, EC074058 CO302368, 3 novel exons
0 MEAIIMTTLPNLTTDAGDSSFWLTGALSLSEMLANSSHSHSTGSTTSTAGSSATESSAVNVGKDHDKHVNDSVSTGLS 2
1 NYSNYPSYIHYRDKYDLSYIAKVNPFWLQFEPPKSSTFLIMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLIKCPIAIYNNIKEGPALGDI 1
2 ACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKEMPARIFMALFFVAAYCIPLTSIVYSYFYILKVVFTASRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAIVAMMGVFGLERHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGRGVLRRVSTTRSSYMTRSRSSFTHRLRTSTTGEGGMGDHRMENYLMNNNLMMVPEETEENEEIVVVAEINNSISSVMEQSKF* 0

>RH7_droSim Drosophila simulans (fruitfly) chr3L:11530420 11532815 
0 MEAIIMTTLPNLTTDAGDSSFWLTGALSLSEMLANSSHGHSTGSTTSSAGSSATESSAVNVGKDHDKHVNDSVSTGLS 2
1 NYSNYPSYIHYRDKYDLSYIAKVNPFWLQFEPPKSSTFLVMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLIKCPIAIYNNIKEGPALGDI 1
2 ACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCIPLTSIVYSYFYILKVVFTASRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAIVAMMGVFGLERHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGRGVLRRVSTTRSSYMTRSRSSFTHRLRTSTTGEGGMGDHRMENYLMNNNLMMVPEETEENEEIVVVAEINNSVSSVMEQSKF* 0

>RH7_droSec Drosophila sechellia (fruitfly) super_0:4344247 4346640 
0 MEAIIMTTLPNLTTDAGDSSFWLTGALSLSEMLANSSHGHSTGSTASSAGSSATESSAVNVGKDHGKHVNDSVSTGLS 2
1 NYSNYPSYIHYRDKYDLSYIAKVNPFWLQFEPPKSSTFLVMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLIKCPIAIYNNIKEGPALGDI 1
2 ACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCIPLSSIVYSYFYILKVVFTASRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAIVAMMGVFGLERHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGRGVLRRVSTTRSSYMTRSRSSFTHRLRTSTTGEGGICDHRMENYLMNNNLMMVPEETEENEEIVVVAEINNSVSSVMEQSKF* 0

>RH7_droYak Drosophila yakuba (fruitfly) chr3L:12207286 12209654 
0 MEAIIMTTLPALTTDAGDSSSFWLTGALSLSEMLANSSHGHSTGSTSSTAGSSATESSTVNVGKDHDVTKHVNDSVSTGLS 2
1 NYSNYPSYIHYRDKYDLSYIAKVNPFWLQFEPPKSSTFLVMAALYFLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLIKCPIAIYNNIKEGPALGDI 1
2 ACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCIPLTSIVYSYFYILKVVFTASRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAIVAMMGVFGLERHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGRGVLRRVSTTRSSYMTRSRSSFTHRLRTSTTGDGGMGDHRMENYLMNNNLMMVPEETEENEEIVVVAEINNSVSSVMEQSKF* 0

>RH7_droEre Drosophila erecta (fruitfly) scaffold_4784:12148112 12150459 
0 MEAIIMTTLPTLTTDAGDSSFWLTGALSLSEMLANSSHGHSTGSTSSTAGSSATESATVNVGKDHDVAKHVNDSVSTGLS 2
1 NYSNYPSYIHYRDKYDLSYIAKVNPFWLQFEPPKSSTFLVMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLIKCPIAIYNNIKEGPALGDI 1
2 ACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYVIILLIWCYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCIPLTTIVYSYFYILKVVFTASRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAIVAMMGVFGLERHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGRGVLRRVSTTRSSYMTRSRSSFTHRLRTSTTGDGGMGDHRMENYLMNNNLMMVPEETEENEEIVVVAEINNSVSSVMEQSKF* 0

>RH7_droAna Drosophila ananassae (fruitfly) scaffold_13337:1483455 1485125+ frameshifted
0 MEAIILSTLPSLTTNASGSSSHWLTGALSLPEILANSSGSPNTSSADTGSGINLSARDADRHFNISTEAR 2
1 NYSYYPGYIHYRDKYDLSYIAKVNPFWLQFEPPHSSTFLAMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLVKCPIAIYNNIKEGPALGDV 1
2 ACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCFPLTAIVYSYFYILKVVFSAGRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAVVAMMGVFGLEKHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGRGILRRVSTTRSSYMTRSRSSFTHPAGRADGGTGRDHRMETYLMNNNLMMVPEETEENEEIVVVAEINNSVSSAIEQSKF* 0

>RH7_droPse Drosophila pseudoobscura (fruitfly) chrXR_group6:2491547 2493151 
0 MEALMAALPTLTTEAAGSSLWLTSALSLSEMLANSSTSPNASLVAATTSSAAVATASTTSAAEAVGKVPDKHEVNDNVSTVLS 2
1 TSSSYPGYIHYRDKYDLSYIARVNPFWLQFEPPKSSTFYLMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLVKCPIAIYNNIKEGPALGDA 1
2 ACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIIFLIWSYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCVPLTTIVYSYFYILKVVFTASRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAIVAMMGVFGQERHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFFGRGVLRRVSTTRSSYMTRSRSSFNRRLRPTPDAEHRVESYLMNNNLMMVPEETEENEEIVVVAEFNNSSYSGMEQSKF* 0

>RH7_droPer Drosophila persimilis (fruitfly) super_9:783822 785423 
0 MEALMAALPTLTTEAAGSSLWLTSALSLSEMLANSSTSPNASLVATTSSAAAATASTTSAAEAVGKVPDKHEVNDNVSTVLS 2
1 SYPGYIHYRDKYDLSYIARVNPFWLQFEPPKSSTFYLMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLVKCPIAIYNNIKEGPALGDA 1
2 ACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIIFLIWSYSFLFAVMPALDIGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCVPLTTIVYSYFYILKVVFTASRIQSNKDKAKTEQ
KLAFIVAAIIGLWFLAWSPYAIVAMMGVFGQERHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFFGRGVLRRVSTTRSSYMTRSRSSFNRRLRPTPDAEHRVESYLMNNNLMMVPEETEENEEIVVVAEFNNSSYSGMEQSKF* 0

>RH7_droWil Drosophila willistoni (fruitfly) scaffold_180949:5140016 5141994+
0 MDMDMALDMNDAATTTSLWITSAALSLSEILVNTTSHVVTTSPASTSTVETTAVAAVTATGKVVHDDEKHHHHHHHHHQDEVNDNNVTTVLR 2
1 NFSSYPGYIHYRDKYDLSYIAKVNPFWLQFEPPRSSTFYIMAALYCLISVVGCIGNAFVIFMFSNRKSLRTPANILVMNLAICDFLMLVKCPIAIYNNIKEGPALGDI 1
2 ACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLVIVMIWCYSFLFAIMPALDVGLSVYVPEGYLTTCSFDYLNKETPARIFMALFFVAAYCVPLTCIMFSYFYILKVVFTANRIQSNKDKAKTEQ
KLTFIVAAIIGLWFLAWSPYAVVAMMGVFGLEQHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLIYGRGVLRRVSTTRSSYITRSRSSFTRRLRTGSELDMRTEPYIMNNNLMMVPEETEENEEIVVVAEINNPSRCVSMHEHTSKF* 0

>RH7_droVir Drosophila virilis (fruitfly) scaffold_13049:6123835 6125790+
0 METIMSTFPTLTSDDGSLWITSALSEMLTSSSSNSSEAAQNATLVAAAAATTTTVAAAAAAAAANASTAATANVTKVHDKHSHAVNDSETDLR 2
1 CSAYPGYIHYRDKYDLDYIAKVNPFWLQFEPPGTSSFYIMAGLYCLISVVGCFGNAFVIFMFGSRKSLRTPANILVMNLAICDFLMLIKCPIAIYNNIQEGPALGDM 1
2 ACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRYSRLRSYLIIILIWCYSFLFAVMPALDVGLSVYVPEGYLTTCSFDYLNKETPARIFMALFFVAAYCIPLISIVYSYFYILKVVFMANRIQSNKDKAKTEQ
KLTFIVAAIIGLWFIAWSPYAIVAMMGVFGQEQHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLMYGRGALRRVSTTRSTYMTRSFTHRMRHTSGDGENRADPYTLNNNLMMVPEETEENDEIIVVAEINNSTSIAMEQSKF* 0

>RH7_droMoj Drosophila mojavensis (fruitfly) scaffold_6680:4445619 4446890+
0 METIMSTLPTLTADDGSLWITSALTELLASGANSSSGSSSVVADGTQNATFVAAATTTTTTVAAAAAAAAAAAVNASTATTANATKGHHKHPHGVNDSETDLR 2
1 LCSSYPGYIHYRDKYDLTYIAKVNPFWLQFEPPDTSTFYIMAALYCLISVVGCVGNAFVIFMFGSRKSLRTPANILVMNLAICDFLMLVKCPIAIYNNIQEGPALGDA 1
2 ACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRYSRLRSYLIIFAIWCYSFLFAVMPALDVGLSVYVPEGFLTTCSFDYLNKETPARIFMALFFVAAYCIPLASIVYSYFYILKVVFTANRIQSSKDKAKTEQ
KLTFIVAAIIGLWFIAWSPYAIVAMMGVFGQEQHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLMYGRGVLRRVSTTRSSYMTRSRSSFTHRLRPSSGDCENRAEPYTLNNNLMMVPEETEENEEIIVVAEINNSISGVMEQSKF* 0

>RH7_droGri Drosophila grimshawi (fruitfly) scaffold_15110:6598464 6600409 
0 METIMSTLPTLAADDGSQWLTSALSEVLASSDGRGAAQNATLAAATAVATATTAVNVSKVDDKHLHTVNDSDTDLT 2
1 RCSSYPGYIHYRDKYDLTYIAKVNPFWLQFEPPDTSTFYMMAGLYCLISVVGCFGNAFVIFMFVSRKSLRTPANILVMNLAICDFLMLVKCPIAIYNNINEGPALGDA 1
2 ACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRFSRLRSYFIIFLIWCYSFVFAVTPALDVGLSVYVPEGYLTTCSFDYLNKDTPARIFMALFFVAAYCIPLTCIVYSYFYILKVVFTANRIQSSKDKAKTEQ
KLTFIVAAIIGLWFIAWSPYAIVAMMGVFGLEQHITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMIFFGRGVLRRVSTTRSSYMTRSRSSFNHRVRSSSNEGDNRAESYKMNNNLMIVPEETDENEEIIVVAEINNSISIDMEQSKF* 0

Hexapoda: Anopheles gambiae (mosquito) .. 5 opsins

Anopheles is one of several mosquitoes with significant amounts of genome sequencing. It is notable for retaining the arthropod ciliary opsin as well as a blue opsin, a standard UV opsin and an RH7 UV ortholog (which, in contrast to that of fellow dipteran Drosophila, has ancestral intronation).

>UV7_anoGam Anopheles gambiae (mosquito) Diptera XM_308329
0 MGRQGSGNAVRISPSSRNQPYFSSAHLSFVVPFPVHSKYVVRSGYVLPVDPLFVAKINPFWLRFDPPSAGEHYGLAVFYFLMMLFGVIGNALVVFMFYR 2
1 YRSLRTPANYLVINLAVADFIIMMEAPMFIYNSIHQGPALGSI 1
2 GCTVYALMGAVGGTVAIATLTVISIDRYNVVVYPLNPNRSTTKLKCYFLIAFTWAYGLLFASFPALEIGLSRYTAEGYLTACSFDYLDRTYKARVFMFVYFVFAW
LIPFAIISYCYARILIAVINANAIQSSKSKNKTEVKLAGVVVGIIGLWFAAWTPYAVVAMMGVFGYEQYLTPLNSMIPAVFAKIAASIDPYFYAMNHPRYRQMLER
MFCNRGADQGNSQYQTSHYTRGASRGGDSEGGGGEESGGGGGVGRAPGGGNAGLGRGGTVRGGGGGGRLIAGKGGGGANATGSTGGGGVKALKKQISNGDETSLEVSLEM* 0

>UV5_anoGam Anopheles gambiae (mosquito) Diptera XM_556823 novel short exon
0 MGLVQLDNQTAYRPEALIGADQSGLRYLGWNVPPEELVHIPEHWLQFPEPEASLHYLLGLLYIAFTIFSLVGNGLVIWIFIA 2
1 AKSLRTPSNVFVINLAICDFFMMAKTPIFIYNSFTKGFTLGNLGCQIFGFVGSLT 1
2 GIGAGATNALIAYDR 2
1 YNTITRPFEGRLTQTKAIIFICLIWAYTIPWGVLPLLEIWGRYVP 1
2 EGFLTSCTFDYLSGTFDTRLFVASIFTFSYVLPMSLIIYYYSQIVSHVVNHEKSLREQAKKMNVESLRSNQNQKDASVEIRIAKAAITVC
FLFVASWTPYAVLALIGAFGDKSLLTPGVTMFPACACKFVACLDPYVYAISHPRYRIELQKRLPWLAITETLPAENASTCTEQQDGNATTQS* 0

>UVB_anoGam Anopheles gambiae (mosquito) Diptera XM_312478
0 MFLGNESISEGAMLMPMARTAGEMPKLLGWNLPPEEQYLVHDHWKGFPSPPYYMHLMLAMIYFVLMNTSLIGNGIVLWIFGT 2
1 SKSLRNGSNMFIINLAIFDLLMMCEMPMFLVNSFSERLVGYGVGCSVYAALGSMSGIGGAISNAVIAFDRYRTISNPLDGRLSRVQAGLLICLTWLWTMPFTLLPLFEIWGRY
IPEGYLTTCSFDYLTDDPDTRVFVGCIFTWAYVIPMIFICYFYARLFGHVRQHEMMLKNQARKMNVESLTANRSEKAQAVEMRIAKAAFTIFFLFVCAWTPYAIVTMIGAFGDR 2
1 TMLTPFVTMVPAVCCKIVSCLDPWVYAISHPKYRQELERRLPWMGIKEADDSVSTTES* 0

>LWS_anoGam Anopheles gambiae (mosquito) Diptera XM_319247 most introns obliterated
0 MPYYGPMQQPGLWGQPVANLTVVDKVPPEIMHLVDPHWSQFPPMNPLWHSIIGFVIFVLGVVSIIGNGMVIYIFSTAKSLRTPSNLFIVNLALSDFLMMGTN
AFTMVYNCWFETWSLGLLMCDLYAFFGSLFGCCSIWTMTMIALDRHNVIVHGLSGKPLTNTGAILRILLCWLIGVVWGILPMLGWNRYVPEGNMTACGTDYLTDDWFHKSYILVYS
VFVYYTPLFTIIYAYFFIIK 0
0 AVSAHEKNMREQAKRMNVQSLRSSDDGKSTEMKLAKVALVTISLWFMAWTPYTVINYTGVFKTASITPLATIWGSVFAKANAVYNPIVYGISHPKY
RAALLRRFPSLACSDGPPADDKSLASEASGITSAGNPTTA* 0

>TMT1_anoGam Anopheles gambiae (mosquito) Gt encephalopsin-class ciliary 461 aa 000 nm no_ref XM_312503 encephalopsin GPROP11 adjacent head-to-head tandem GPROP12   
0 MYDVTDAAAINSDHQELMAPWAYNGAAVTLFFIGFFGFFLNIFVIALMYKDVQ 0
0 LWTPMNIILFNLVCSDFSVSIIGNPLTLTSAISHRWLYGKSICVAYGFFMSLL 1
2 GIASITTLTVLSYERFCLISRPFAAQNRSKQGACLAVLFIWSYSFALTSPPLFGWGAYVNEAANIS 2
1 CSVNWESQTANATSYIIFLFIFGLILPLAVIIYSYINIVLEMRK 0
0 NSARVGRVNRAERRVTSMVAVMIVAFMVAWTPYAIFALIEQFGPPELIGPGLAVLPALVAKSSICYNPIIYVGMNTQ FRAAFWRIRRSNGVAGQPDSNNTNNSNRDKESARHTAKEGL
ECSLDFCHWTVRGTRVSISSAERNVPAPAARERSGGHSVTGSREESRDRHVTLKTMLSVGPRSPSSVAPVAADCSTTDVPTSGDGSVRIVRQDSELSVIHDGGGGGGGSSSRVLVIKSQKPRSNML* 0

Hexapoda: Apis mellifera (bee) .. 5 opsins

The bee genome has proven quite instructive in terms of ancestral information, both in gene retention and in conservation of intron patterns. The transcript situation is still poor however. Apis has five opsins, including a ciliary (pteropsin) opsin, but lacks an RH7 ortholog. The ciliary opsin was localized to head but never pinpointed anatomically, prohibiting comparisons to Platynereis.

>UV5_apiMel Apis mellifera (bee) AF004169 353 nm 5 exons Arthropoda Insecta complete genNow
0 MSNDSIHWEARYLPAGPPRLLGWNVPAEELIHIPEHWLVYPEPNPSLHYLLALLYILFTFLALLGNGLVIWIFCA 2
1 AKSLRTPSNMFVVNLAICDFFMMIKTPIFIYNSFNTGFALGNLGCQIFAVIGSLTGIGAAITNAAIAYDRYS 2
1 TIARPLDGKLSRGQVILFIVLIWTYTIPWALMPVMGVWGRFVPEGFLTSCSFDYLTDTNEIRIFVATIFTFSYCIPMILIIYYYSQIVSHVVNHEKALREQAKKMNVDSLRSNANTSSQSAEIRIAK 0
0 AAITICFLYVLSWTPYGVMSMIGAFGNKALLTPGVTMIPACTCKAVACLDPYVYAISHPKYR 2
1 LELQKRLPWLELQEKPISDSTSTTTETVNTPPASS* 0

>UVB_apiMel Apis mellifera AF004168 439 nm 8 exons Arthropoda Insecta complete genNow
0 MLLHNKTLAGKALAFIAEEG 2
1 YVPSMREKFLGWNVPPEYSDLVHPHWRAFPAPGKHFHIGLAIIYSMLLIMSLVGNCCVIWIFST 2
1 SKSLRTPSNMFIVSLAIFDIIMAFEMPMLVISSFMERMIGWEIGCDVYSVFGSISGMGQAMTNAAIAFDRYR 2
1 TISCPIDGRLNSKQAAVIIAFTWFWVTPFTVLPLLKVWGRYTT 1
2 EGFLTTCSFDFLTDDEDTKVFVTCIFIWAYVIPLIFIILFYSRLLSSIRNHEKMLREQ 0
0 AKKMNVKSLVSNQDKERSAEVRIAKVAFTIFFLFLLAWTPYATVALIGVYGNR 2
1 ELLTPVSTMLPAVFAKTVSCIDPWIYAINHPR 2
1 YRQELQKRCKWMGIHEPETTSDATSAQTEKIKTDE* 0

>LWSa_apiMel Apis mellifera (bee) Gq 386 aa 16291092 NM_001077825 rhabdomeric AmLop2 long wavelength ocelli not compound 
0 MDTLNITTSFFIEVMPSNISTLTTTGPQFARQLMRFNNQTVVSKVPEEMLHLIDLYW 2
1 YQFPPLDPLWHKILGLVMIILGIMGWCGNGVVVYVFIMTPSLRTPSNLLVVNLAFSDFIMMGFMCPPMVICCFYETW 0
0 VLGSLMCDIYAMVGSLCGCASIWTMTAIALDRYNVIVK 0
0 GMSGTPLTIKRAMLQILGIWLFGLIWTILPLVGWNR 2
1 YVPEGNMTACGTDYLSQDWTFKSYILVYSFFVYYTPLFTIIYSYYFIVS 0
0 AVAAHEKAMKEQAKKMNVTSLRSGDNQNTSAEAKLAK 0
0 VALTTISLWFMAWTPYLVINYIGIFNRSLITPLFTIWGSLFAKANAIYNPIVYGIS 2
1 HPKYRAALKEKLPFLVCGSTEDQTAATAGDKASEN* 0

>LWSb_apiMel Apis mellifera U26026 529 5 exons Arthropoda Insecta 540 complete genNow
0 MIAVSGPSYEAFSYGGQARFNNQTVVDKVPPDMLHLIDANWYQYPPLNPMWHGILGFVIGMLGFVSVMGNGMVVYIFLSTKSLRTPSNLFVINLAISDFLMMFCMSPPM 0
0 VINCYYETWVLGPLFCQIYAMLGSLFGCGSIWTMTMIAFDRYNVIVKGLSGKPLSINGALIRIIAIWLFSLGWTIAPMFGWNR 2
1 YVPEGNMTACGTDYFNRGLLSASYLVCYGIWVYFVPLFLIIYSYWFIIQAVAAHEKNMREQAKKMNVASLRSSENQNTSAECKLAK 0
0 VALMTISLWFMAWTPYLVINFSGIFNLVKISPLFTIWGSLFAKANAVYNPIVYGIS 2
1 HPKYRAALFAKFPSLACAAEPSSDAVSTTSGTTTVTDNEKSNA* 0

>TMT_apiMel Apis mellifera (bee) Gt ciliary 329 aa 16291092 NM_001039968 ciliary AmLop2 compound eye not ocelli pteropsin clock   
0 MSLNRSTMEHVIYEDQVSPVMYIGAAIALGFIGFFGFTANLLVAIVIVKDAQILWTPVNVILFNLV 0
0 FGDFLVSIFGNPVAMVSAATGGWYWGYKMCLW 2
1 YAWFMSTLGFASIGNLTVMAVERWLLVARPMQALSIR 2
1 HAVILASFVWIYALSLSLPPLFGWGSYGPEAGNVSCSVSWEVHDPVTNSDTYIGFLFVLGLIVPVFTIVSSYAAIVLTLKKVRKRA 1
2 GASGRREAKITKMVALMITAFLLAWSPYAALAIAAQYFN 0
0 AKPSATVAVLPALLAKSSICYNPIIYAGLNNQFSRFLKKIFDARGSRTAVPDSQHTALTALNRQEQRK* 0

Hexapoda: Nasonia vitripennis (jewel_wasp) .. 4 opsins

The jewel wasp genome contains 4 opsins: one each for UV and blue and a facing tandem pair --><-- with 1 kbp separation for long wavelength. No RH7-type UV nor ciliary opsin is present at the current level of coverage, even though the latter is present in another hymenopteran, the bee.

The two LWS paralogs are intronated somewhat differently. Using outgroups, it can be seen that 4 events (two intron losses and two gains) are needed to synchronize intron patterns. None of these events happened within the Nasonia lineage, because the same patterns also occur in Apis. Two others go back at least to the common ancestor with chelicerates.
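Counting the events that separate two intron patterns, and polarizing them with an outgroup, can be sketched as below. This is a simple parsimony illustration under stated assumptions: each intron is a (position, phase label) pair, a site present in the outgroup is taken as ancestral, and all coordinates in the demo are hypothetical placeholders rather than the actual Nasonia LWS intron positions.

```python
from typing import List, Set, Tuple

Site = Tuple[int, str]  # (residue position, phase label such as "21")

def intron_events(paralog_a: Set[Site], paralog_b: Set[Site],
                  outgroup: Set[Site]) -> List[Tuple[Site, str]]:
    """Classify each intron found in only one paralog as a gain or a loss."""
    events = []
    for site in sorted(paralog_a ^ paralog_b):  # sites not shared by both
        holder = "A" if site in paralog_a else "B"
        other = "B" if holder == "A" else "A"
        if site in outgroup:
            # Ancestral intron missing from one paralog: count a loss there.
            events.append((site, f"loss in {other}"))
        else:
            # Intron unknown in the outgroup: count a gain in its holder.
            events.append((site, f"gain in {holder}"))
    return events

# Hypothetical intron sets chosen to give two losses and two gains.
lws_a = {(57, "21"), (134, "00"), (172, "00"), (300, "21")}
lws_b = {(57, "21"), (134, "00"), (208, "21"), (259, "00")}
outgroup = {(57, "21"), (134, "00"), (172, "00"), (208, "21")}

for site, call in intron_events(lws_a, lws_b, outgroup):
    print(site, call)
```

With several outgroups the same comparison would be repeated per lineage to date each event, which is how events shared with Apis can be excluded from the Nasonia branch.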

>UV5_nasVit Nasonia vitripennis (jewel_wasp) XM_001608024 wrong, transcripts GE436449 GE390962, very similar Apis
0 MPYYNWNGTDQTAGWPEARIQPAGAPRLLGWNVPPEELVHIPEHWLVYPEPNPALHYLLALLYILFTFVALLGNGLVIWIFCA 2
1 AKSLRTPSNMFVVNLAICDFMMMLKTPIFIYNSFHTGFALGNLGCQIFSFIGSLSGIGASITNAAIAYDRYS 2
1 TIARPLDGKLSRGQVMMLIVLIWMYTIPWALMPSMGVWGRFVP EGFLTSCTFDYITDSDEIRYFVGTIFTFSYAIPMTLIIYFYSQIVGHVVNHEKALREQAKKMNVESLRSGQNKDQASAEVRIAK 0
0 VALTICFLFVAAWTPYGVMSLIGAFGNK SLLTPGVTMIPACCCKAVACLDPYVYAISHPRYR 2
1 LELQKRMPWLELQEKPPASDATSTTTEAVPASS* 0

>UVB_nasVit Nasonia vitripennis (jewel_wasp) XM_001604572 ES636068
0 MAFVGLNGAMGGMGPA 1
2 EKPLQRYSQGPQMQEHLLGWNHPPEHIDIVHPHWRGFLAPGKYWHIGLALIYFMLLVLSFVGNGCVVWIFST 2
1 SKVLRTPSNLFIINLALFDLVMALEIPMLIINSFIERMIGWGLGCDIYAALGSVSGIGSAITNAAIAYDRYR 2
1 TISCPIDGRLNGKQAAVMVAFTWFWTMPFTILPFAKIWGRYTT 1
2 EGFLTTCSFDFLSDDQDTKVFVAAIFSWSYCFPMVLIIYFYSQLIKSVRRHEKMLREQ 0
0 QAKKMNVKSLSAQDKERSVEMRIAKVAFTIFFLFVCSWTPYAVVTMIAAFGNR 2
1 ELVTPFSSMLPAVFAKTVSCIDPWVYAINHPR 2
1 YRQELTKRCQWMGIHEPDSGPSQNNAEAVSVTTEKLKSDDA* 0

>LWSa_nasVit Nasonia vitripennis (jewel_wasp) XM_001606013 GE417061 22063-23541 - strand of AAZX01007316  -->1 kbp <--
0 MGPSFLTLTAMAQRGGYGGGGGFGGGFNNQTVVDKAPPEIHHMIDPYWYQFPPMNPLWYGILGFVIGCLGCISVAGNGMVVYIFASTKSLRTPSNLLVINLAFSDFCMMFTMSPPM 0
0 VINCYYETWVFGPLMCEIYALCGSIFGCGSIWTMCMIAFDRYNVIVKGLSAKPMTINGSLLRILGIWLMASIWTIAPMFGWNR 2
1 YVPEGNLTACGTDYFSKDWVSRSYIVVYSFFVYFLPLFMIIYSYYFIIKAVSAHEKNMREQAKKMNVASLRQGDSQSAENKLAK 0
0 IALMTISLWFMAWTPYLVINWAGIFDLARLTPLFTIWGSVFAKANAVYNPIVYGIS 2
1 HPKYRAALFARFPSLACAGDAPAGAASDAVSTTSGVTTLTDHDKSNA* 0 

>LWSb_nasVit Nasonia vitripennis (jewel_wasp) tandem pair to LWSa, fairly diverged 19237-21046 + strand of AAZX01007316
0 MEHPIVAAGVNATGEFDASSGSASSTTTMVTTAAVQVASTIGPHFARQVMRGFGNLTVVDKVPPEMLHLVGPHW 2
1 YQFPPLWPIWHKLLGVVMIFIGVLGWCGNGMVVYIFLVTPSLRTPSNLLVINLAFSDFVMMIIMSPPMVVNCWYETW 0
0 ILGPLMCDIYALIGSLCGGASIWTMTAIAYDRYNVIVK 0
0 GMSGTPLTIPRALVQIVLIWTHGLIWAMLPLFGWNR 2
1 YVPEGNMTSCGTDYVSDDWLGKSYILVYSIFVYYTPLFSIILCYWHIVS 0
0 AVAAHERGMREQAKKMNVASLRSGDQSGESAEVKLAK 0
0 VAVTTISLWFLAWTPYLVTNYMGIFAKQHVSPLFTIWASLFAKTNACYNPIVYGIS 2
1 HPKYRAGLKVKCPCLVFGDTEDKPKPAAATPAADAASTHSKA* 0

Arthropod opsin gene tree .. 79 opsins

Unalignable N- and C-terminal residues are trimmed off below. The gene tree below arises from their alignment. Note that lophotrochozoan and deuterostome melanopsins cluster together to the exclusion of arthropod genes. The latter fall into two primary clusters of UV and long wavelength. The Rh7 group of UV opsins diverges fairly early within the gene tree. The sole cnidarian gene in this class does not quite form an outgroup but instead nests within ecdysozoan melanopsins. Various outliers in Branchiopoda might indicate the beginning of new sub-clades but the more basal Chelicerates need far better representation.

The nomenclature used here seeks to convey both gene classification and peak wavelength in a few letters that additionally avoid conflict with deuterostome gene names and bow somewhat to Drosophila opsin numbering (where all ecdysozoan genetic work takes place). Thus UV7 and UV5 consist of ultraviolet-peaking opsins closely related to Drosophila Rh7 and Rh5, respectively. If the lysine determinant at position 90 is a blue-shifting residue instead, that is denoted by UVB. Such substitutions may have occurred in both directions multiple times. Similarly long and middle wavelength sensitivity is denoted as LMS. The BCR series derives from founder sequences BcRh1 in the crab Hemigrapsus. The fasta header of the reference sequences contains various literature and site synonyms. When in doubt, a simple text search of 4-5 residues will resolve nomenclature uncertainty.
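The short text search suggested above can be scripted over a set of named sequences. The sketch below is a minimal illustration: the function name is hypothetical, and the two entries are short excerpts of rows from the alignment on this page, not complete sequences.

```python
from typing import Dict, List

def find_by_peptide(peptide: str, records: Dict[str, str]) -> List[str]:
    """Return the names of all records whose sequence contains the peptide.

    Alignment gaps ('-') and stray spaces are removed before searching so
    that aligned rows and raw fasta text can both be used as input.
    """
    query = peptide.upper()
    hits = []
    for name, seq in records.items():
        cleaned = seq.replace("-", "").replace(" ", "").upper()
        if query in cleaned:
            hits.append(name)
    return sorted(hits)

# Excerpts around the first shared intron, copied from the alignment rows.
records = {
    "UV7_droMel": "NRKSLRTPANILVMNLAICDFLMLI-KCPIAIYNNI",
    "UV5_apiMel": "AAKSLRTPSNMFVVNLAICDFFMMI-KTPIFIYNSF",
}
print(find_by_peptide("NILVM", records))   # specific to one entry
print(find_by_peptide("KSLRTP", records))  # conserved motif, matches both
```

A query inside a strongly conserved motif matches many opsins, so a peptide from a variable stretch is the better choice when the goal is to identify a single gene.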

Species with opsin data (taxa taken from GenBank taxonomy). Note many important groups (eg myriapods and onychophorans) have no opsin data.
 
Insecta  Pterygota Neoptera Paraneoptera   Hemiptera    Acyrthosiphon
Insecta  Pterygota Neoptera Paraneoptera   Hemiptera    Rhodnius
Insecta  Pterygota Neoptera Paraneoptera   Hemiptera    Homalodisca
Insecta  Pterygota Neoptera Paraneoptera   Hemiptera    Megoura
Insecta  Pterygota Neoptera Paraneoptera   Phthiraptera Pediculus
Insecta  Pterygota Neoptera Endopterygota  Diptera      Drosophila
Insecta  Pterygota Neoptera Endopterygota  Diptera      Anopheles
Insecta  Pterygota Neoptera Endopterygota  Hymenoptera  Apis
Insecta  Pterygota Neoptera Endopterygota  Hymenoptera  Nasonia
Insecta  Pterygota Neoptera Endopterygota  Coleoptera   Tribolium 
Insecta  Pterygota Neoptera Endopterygota  Coleoptera   Luciola
Insecta  Pterygota Neoptera Endopterygota  Lepidoptera  Manduca
Insecta  Pterygota Neoptera Endopterygota  Lepidoptera  Papilio
Insecta  Pterygota Neoptera Orthopteroidea Orthoptera   Schistocerca
Insecta  Pterygota Neoptera Orthopteroidea Orthoptera   Dianemobius
 
Crustacea Branchiopoda Phyllopoda     Diplostraca       Daphnia
Crustacea Branchiopoda Phyllopoda     Notostraca        Triops
Crustacea Branchiopoda Sarsostraca    Anostraca         Branchinella
Crustacea Malacostraca Eumalacostraca Eucarida          Hemigrapsus
Crustacea Malacostraca Eumalacostraca Eucarida          Portunus
Crustacea Malacostraca Eumalacostraca Hoplocarida       Neogonodactylus

Chelicerata     Merostomata   Xiphosura                 Limulus
Chelicerata     Arachnida     Acari                     Ixodes
Chelicerata     Arachnida     Araneae                   Plexippus
Chelicerata     Arachnida     Araneae                   Hasarius

ArthrOpsins.jpg

In the alignment, red indicates residues conserved in almost all opsins and even GPCRs; blue indicates residues that are less conserved but sometimes indicative of opsin class. Indels are often diagnostic.
Landmarks are marked up including ultraviolet K90, DRY, Schiff K, NPxxYxxxxxFR, transmembrane regions, informative indels, phyloSNPs etc. Intron boundaries can also be informative.
Special res ..........................................................................UV.............................................DRY...........................................
UV7 diagnos ....................................................................................................................................++..............................G..
UV  diagnos ..........................................................................-......................................................P.................................+...
Location    ETETETETETETETETETETETM1M1M1M1M1M1M1M1M1M1M1M1MC1C1C1C1C1C1M2M2M2M2M2M2M2M2M2M2ME1E1E1E1E1E1E1E1E1E1E1E1EM3M3M3M3M3M3M3M3C2C2C2C2C2C2C2C2C2C2M4M4M4M4M4M4M4M4M4M4ME2E2E

UV7_aedAeg  EDAFRDRINPFWLQFDPPSRTAHYILGFIYFMMMMFGLCGNLLVILMFFRFKSLRTPANYLVINLAIADFIIML-EAPLFVYNSY--HQGPATGNVWCTIYALLGAVGGTVAIVTLTMISIDRYNVVVYPLNPKRSTTRLKVALMIVFAWIYGLVFSVIPALDIGLS
UV7_culQui  EDAFRDRINPFWLQFEPPSPVAHYALGFVYFLMMVWGLFGNVLVIFMFFKFKSLRTPANYLVINLAVADFLIML-EAPIFVYNSY--HLGPAFGNTLCTIYSLLGAIGGTVAIMTLTMISVDRYNVVVYPLNPNRSTTRLKVMLMIVFTWIYALVFSLMPALEIGLS
UV7_anoGam  DPLFVAKINPFWLRFDPPSAGEHYGLAVFYFLMMLFGVIGNALVVFMFYRYRSLRTPANYLVINLAVADFIIMM-EAPMFIYNSI--HQGPALGSIGCTVYALMGAVGGTVAIATLTVISIDRYNVVVYPLNPNRSTTKLKCYFLIAFTWAYGLLFASFPALEIGLS
UV7_droMel  DLSYIAKVNPFWLQFEPPKSSTFLIMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLI-KCPIAIYNNI--KEGPALGDIACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLS
UV7_droYak  DLSYIAKVNPFWLQFEPPKSSTFLVMAALYFLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLI-KCPIAIYNNI--KEGPALGDIACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLS
UV7_droAna  DLSYIAKVNPFWLQFEPPHSSTFLAMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLV-KCPIAIYNNI--KEGPALGDVACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIILLIWCYSFLFAVMPALDIGLS
UV7_droPse  DLSYIARVNPFWLQFEPPKSSTFYLMAALYCLISVVGCVGNAFVIFMFANRKSLRTPANILVMNLAICDFLMLV-KCPIAIYNNI--KEGPALGDAACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLIIFLIWSYSFLFAVMPALDIGLS
UV7_droWil  DLSYIAKVNPFWLQFEPPRSSTFYIMAALYCLISVVGCIGNAFVIFMFSNRKSLRTPANILVMNLAICDFLMLV-KCPIAIYNNI--KEGPALGDIACRIYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRCSRLRSYLVIVMIWCYSFLFAIMPALDVGLS
UV7_droMoj  DLTYIAKVNPFWLQFEPPDTSTFYIMAALYCLISVVGCVGNAFVIFMFGSRKSLRTPANILVMNLAICDFLMLV-KCPIAIYNNI--QEGPALGDAACRLYGFVGGLSGTCAIGTLTAIALDRYNVVVHPLQPLRRYSRLRSYLIIFAIWCYSFLFAVMPALDVGLS
UV7a_acyPi  TDDYIKLINSHWLKFMPPNPTSHYVLGLLYTVIMVFGCTGNSLVIFMYFKCRSLQTPANMLIINLAVSDFIMLA-KASVFIYNSY--YLGPALGKLGCQVCGFLGGLTGTVSIMTLAAISLDRYYVIVCPLKAAVKTTKQRARIWIGLIWIYGFSFSIVPVLDLGYS
UV7b_acyPi  TDDYMKLINSHWFKFMPPNATSHYILGFLYSVIMVLGCFGNSLVIFMYIKCKSLQTPANVLIMNLAVSDFIMLA-KTPVFIYNSF--YQGPTLGKLGCQIYGFFGGLTGTVSIMTLAAISLDRYYVIVHPLNAAVKTTKQRARVWIGLIWIYGFLFSIIPVMDLGYN
UV7_rhoPro  TEEYLKLVNTHWFEYPPPNKQIHYIFAAVYFLVMLVGVSGNLLVIFMILRFRTLRTSSNILILNLAVSDFLMVA-KMPVFIYNSF--YFGPVLGEMGCHFYGFIGGLSGTASILTLAAIAMDRYLGIAHPLNFNQGRAKKRTIVWITFIWVYSITFASIPLSHIGVK
UV7_pedHum  DDEYLYKINKYWMKFPPPSPMSHYFMGIIYSVIMVVGVFGNFLIIYLFLRKRSLRTPSNVFIFNLAVSDSLLLL-KMPVFIINSF--YLGPALGNLGCSAYGFVGGLTGTVSIMTLAAIAFDRYQVIVHPLE---RKTKAAVYFQILLIWIYAIFFSIIPLLDVGLN
UV7_ixoSca  TEEYLKLVNTHWFEYPPPNKQIHYIFAAVYFLVMLVGVSGNLLVIFMILRRRRIRSQANLLVFNLALSDLLMVL-EIPLLVYNSL--KLRPALGVWGCQLYGLMGGLSGTSAIFSIAALSLERYLALGRPRDPFARLTRSRAFALSLSSWIYALCFSAWPLLGV-TS
UV5_anoGam  PPEELVHIPEHWLQFPEPEASLHYLLGLLYIAFTIFSLVGNGLVIWIFIAAKSLRTPSNVFVINLAICDFFMMA-KTPIFIYNSF--TKGFTLGNLGCQIFGFVGSLTGIGAGATNALIAYDRYNTITRPFE--GRLTQTKAIIFICLIWAYTIPWGVLPLLEI-WG
UV5_nasVit  PPEELVHIPEHWLVYPEPNPALHYLLALLYILFTFVALLGNGLVIWIFCAAKSLRTPSNMFVVNLAICDFMMML-KTPIFIYNSF--HTGFALGNLGCQIFSFIGSLSGIGASITNAAIAYDRYSTIARPLD--GKLSRGQVMMLIVLIWMYTIPWALMPSMGV-WG
UV5_apiMel  PAEELIHIPEHWLVYPEPNPSLHYLLALLYILFTFLALLGNGLVIWIFCAAKSLRTPSNMFVVNLAICDFFMMI-KTPIFIYNSF--NTGFALGNLGCQIFAVIGSLTGIGAAITNAAIAYDRYSTIARPLD--GKLSRGQVILFIVLIWTYTIPWALMPVMGV-WG
UV5_diaNig  PAEELIHIPEHWLTYPAPDAFSYYILGMLYVAFCFIALIGNGLVIWVFSSAKTLRTPSNIFVINLALYDFIMML-KTPIFIYNSF--NLGFGLGQLGCQIFAFMGSVSGIGAAATNACIAYDRYRVIARPFD--SKMSIKGATLLVLLVWMWALPWAILPLLEI-WG
UV5_lucCru  PKSELHHIPEHWLVYPEPEASIHYLLGIVYIFICFMGIVGNGLVLWIFSTSKSLKTASNMFVVNLAFCDFIMMM-KMPIFVYNSF--NRGYALGHIGCQIFGFVGSLSGIGAGMTNAFIAYDRYATISNPLE--GKLTRTKALIMIFIIWGYTFPWAVLPMFEV-WC
UV5_triCas  PKDELIHIPQHWLVYPEPEASMHFLLALIYIGFFIMATIGNGLVIWIFSTSKSLRTASNMFVVNLAICDFAMMI-KTPIFIYNSF--YRGFALGHLGCQIFAFIGSLSGIGAGMTNACIAYDRYTTITRPFD--GKITRTKALVMIIFVWGYTIPWAVMPLLEI-WG
UV5_rhoPro  SPEDLKHIPEHWLSYPEPEPILNYALGVLYIFFMLIALIGNGLVIWIFSTAKTLRTPSNIFVVNLAICDFLMMS-KTPIFIYNSF--KLGYALGHRACQIFALLGSFSGIGASATNAVIAYDRYRVIATPFA--PKLSRTKAVLYLALVWAYVTPWALLPLFEQ-WS
UV4_droMel  PPDQIQYIPEHWLTQLEPPASMHYMLGVFYIFLFCASTVGNGMVIWIFSTSKSLRTPSNMFVLNLAVFDLIMCL-KAPIFIYNSF--HRGFALGNTWCQIFASIGSYSGIGAGMTNAAIGYDRYNVITKPMN--RNMTFTKAVIMNIIIWLYCTPWVVLPLTQF-WD
UV3_droMel  PPEELRHIPEHWLTYPEPPESMNYLLGTLYIFFTLMSMLGNGLVIWVFSAAKSLRTPSNILVINLAFCDFMMMV-KTPIFIYNSF--HQGYALGHLGCQIFGIIGSYTGIAAGATNAFIAYDRFNVITRPME--GKMTHGKAIAMIIFIYMYATPWVVACYTET-WG
UV5_manSex  TGDDLAAIPEHWLSYPAPPASAHTALALLYIFFTFAALVGNGMVIFIFSTTKSLRTSSNFLVLNLAILDFIMMA-KAPIFIYNSA--MRGFAVGTVGCQIFALMGAYSGIGAGMTNACIAYDRHSTITRPLD--GRLSEGKVLLMVAFVWIYSTPWALLPLLKI-WG
UV5_papXut  TGEDLAAIPEHWLSYPAPPASAHTMLALVYVFFTAAALIGNGLVIFIFSASKSLRTPSNLLVVQLAVLDFLMML-KAPIFIYNSI--KRGFASGVIGCQIFAFMGSVSGTAAGLTNACIAYDRHSTITRPLD--GRLSRGKVLLMMVCVWLYTAPWAILPQLQI-WG
UV5_acyPis  QAEDLIHIPEHWLKYQEPSSLQHYYLAFMYTIFMFVALFGNGLVIWVFCVAKPLRTPSNIFVINLALCDFVMMA-KAPIFILGSI--NRGYQ-GHFLCQLFGTAGAFSGIGASATNAAIAYDRFSTIAKPFD--GRMTYGRAFFLIICIWTYTLPWGLLPLTEK-WN
UVB_acyPis  TPEDLTHIPEHWLSYPEVRSLYHYILAFSYTILFCLGVIGNGLVLWIFCVSKPLRTPSNLFVLNLALCDFSMVL-VLPILIYDSI--DHKYP-GHLQCQIFALCGSISGIGAGATNAAIAYDRYSTIAKPFE--GRMTYGKALILIICIWIYVLPWCLLPLTEK-WN
UVB_megVic  TPEDLTHIPEHWLSYPEVRSLYHYILAFSYTILFCIGVIGNGLVLWIFCVSKPLRTPSNLFVLNLALCDFSMVL-VLPILIYDSI--DHKYP-GHLQCQIFALCGSISGIGAGATNAAIAYDRYSTIAKPFE--GRMTYGKALILIICIWIYVLPWCLLPLTEK-WN
UV5_dapPul  PEDYMSYVHPYWKTFEAPNPFLLYMIGFLYTIFMFCCVAGNGVVIWIFTNCKSLRTPSNMLVVNLAILDMLMML-KSPVMIINSY--NEGPIWGKLGCDVFGLMGSYNGIGSAVNNAAIAYDRHRTISRPLD--GKLSRKQVTLMIVAIWAWATPFSVMPFLGI-WG
UV5_braKug  PAEYMEFVHPHWKQFEAPNPFLHYMLGVFYIIFMFCSLIGNGVVIWVFASAKSLRTPSNLFVINLAVLDFLMML-KTPVFIVNSF--NEGPIWGKTGCDFFALLGSYAGIGGATTNAAIAFDRYRTIAHPFD--GKLSRGQAITLCMLCWLYATPFSLMPFFGI-WG
UV5_triLon  PKDYMEYVHPHWQTFEAPNPFLHYLLGVLYIGFMFCALVGNGVVIWIFSSAKSLRTPSNMFVINLAVLDFIMMM-KTPVFIVNSF--NEGPIWGKFGCDLFALMGSYSGIGGAMTNAAIAFDRYRTIARPFD--GKLSRGKVLTICAGIWLWATPFSLMPLFGI-WG
UV5_triGra  PKDYMDYVHPHWQTFEAPNPFLHYLLGVLYIGFMFCALVGNGVVIWIFSSAKSLRTPSNMFVINLAVLDFIMMM-KTPVFIVNSF--NEGPIWGKFGCDMFALMGSYSGIGGAMTNAAIAFDRYRTIARPFD--GKLSRGKVLTICAGIWLWATPFSLMPLFGI-WG
UV5_pedHum  DPSELVHIPDHWFNFSAPHPLSNYLLGFLYFIFFVISCTGNGIVIWIFTTSKNLRTASNVFVVNLAIFDFIMMA-KTPIMIYNSM--NLGFECGFVWCQIFASAGALSGIGASITNTCIAYDRCETITNPLQ---KSGKKKAFLLAAFTWIYALPWAVLPFLEI-WG
UVB_anoGam  PPEEQYLVHDHWKGFPSPPYYMHLMLAMIYFVLMNTSLIGNGIVLWIFGTSKSLRNGSNMFIINLAIFDLLMMC-EMPMFLVNSF--SERLVGYGVGCSVYAALGSMSGIGGAISNAVIAFDRYRTISNPLD--GRLSRVQAGLLICLTWLWTMPFTLLPLFEI-WG
UVB_manSex  PEEHQDLVHDHWRNFPAVSKYWHYVLALIYTMLMVTSLTGNGIVIWIFSTSKSLRSASNMFVINLAVFDLMMML-EMPLLIMNSF--YQRLVGYQLGCDVYAVLGSLSGIGGAITNAVIAFDRYKTISSPLD--GRINTVQAGLLIAFTWFWALPFTILPAFRI-WG
UVB_nasVit  PPEHIDIVHPHWRGFLAPGKYWHIGLALIYFMLLVLSFVGNGCVVWIFSTSKVLRTPSNLFIINLALFDLVMAL-EIPMLIINSF--IERMIGWGLGCDIYAALGSVSGIGSAITNAAIAYDRYRTISCPID--GRLNGKQAAVMVAFTWFWTMPFTILPFAKI-WG
UVB_apiMel  PPEYSDLVHPHWRAFPAPGKHFHIGLAIIYSMLLIMSLVGNCCVIWIFSTSKSLRTPSNMFIVSLAIFDIIMAF-EMPMLVISSF--MERMIGWEIGCDVYSVFGSISGMGQAMTNAAIAFDRYRTISCPID--GRLNSKQAAVIIAFTWFWVTPFTVLPLLKV-WG
UVB_diaNig  PAEHIELVHSHWRGYEAPSKYWHYWFAFMYFCIMIMSCLGNGIVLWIFATTKSLRTPSNMFVVNQALLDLLMMI-EMPMFVLNSL-FYQRPIGWEMGCDIYALLGAVSGIGSAINNAAIAYDRYRTISFPLD--GRLQFGHALAFIVGVWSWAMPFSLLPLLKV-WG
UV5B_droMe  PAEYQHMVHAHWRGFREAPIYYHAGFYIAFIVLMLSSIFGNGLVIWIFSTSKSLRTPSNLLILNLAIFDLFMCT-NMPHYLINAT--VGYIVGGDLGCDIYALNGGISGMGASITNAFIAFDRYKTISNPID--GRLSYGQIVLLILFTWLWATPFSVLPLFQI-WG
UV5_plePay  NAAPDIYVPDYWKQFRAPAPYLHYMLGFFYICLMSIAVVGNAIVMYIFFSAKTLRTPTNMFVIGLAMADLLMMS-KTPVFIYNCF--HLGPVFGQIGCDIYGIVGTYSGIGSAFCNAIIAYDRYRVIVHPFSK-SGMSITKAIAFLVIIYLYITPFAILPALKI-WS
UV5_hasAda  NAAPDILVPDYWKQFRAPAPYLHYILGCLYICLMSVALIGNAIVIYIFSVSKSLRTPTNMFVIGLAMADLLMMS-KTPVFIYNCF--HLGPVFGQLGCDIYAIVGTYSGIGSAFCNAVIAYDRYRVIVHPFSK-SGMTMTKAIAILVIVYLYITPFAILPALKI-WS
LWS_anoGam  PPEIMHLVDPHWSQFPPMNPLWHSIIGFVIFVLGVVSIIGNGMVIYIFSTAKSLRTPSNLFIVNLALSDFLMMGTNAFTMVYNCW--FETWSLGLLMCDLYAFFGSLFGCCSIWTMTMIALDRHNVIVHGLSG-KPLTNTGAILRILLCWLIGVVWGILPMLG--WN
LWS_rhoPro  PPEMLSMVDAHWYQFPPLNPLWHGILGFVIGVLGIISIVGNGMVIFIFSSTKTLRTPSNLLVVNLAFSDFLMMFTMSPPMVINCY--NETWVLGPLMCELYGMLGSLFGCASIWTMTMIALDRYNVIVKGISA-KPMTNKTAMLRILLVWAFSIMWTVFPFFG--WN
LWS_schGre  PPEMLYLVDPHWYQFPPMNPLWHGLLGFVIGVLGVISVIGNGMVIYIFSTTKSLRTPSNLLVVNLAFSDFLMMFTMSAPMGINCY--YETWVLGPFMCELYALFGSLFGCGSIWTMTMIALDRYNVIVKGLSA-KPMTNKTAMLRILFIWAFSVAWTIMPLFG--WN
LWS_lucCru  PPDMLHLIDAHWYQYPPLNPLWHAILGFMIGVLGCISVTGNGMVIYIFSTTKSLRSPSNLLVVNLAFSDFLMMFTMAPPMVINCY--NETWVWGPLFCQIYGMLGSLFGCTSIWTMTMIALDRYNVIVKGLSA-KPLTKQGALIRIFLVWVFSIGWTIAPVFG--WN
LWS_triCas  LPDMLHLVDAHWYQFPPMNPLWHGILGFVIGVLGFVSIVGNGMVIYIFSSTKALRTPSNLLVVNLAFSDFLMMLCMSPAMVINCY--NETWVLGPLVCELYGMSGSLFGCASIWTMTFIALDRYNVIVKGLSA-QPLTKKGAMLRILIIWVFSTLWTIAPFFG--WN
LWS_manSex  PPDMMHMIDPHWYQFPPMNPLWHALLGFTIGVLGFVSISGNGMVIYIFMSTKSLKTPSNLLVVNLAFSDFLMMCAMSPAMVVNCY--YETWVWGPFACELYACAGSLFGCASIWTMTMIAFDRYNVIVKGIAA-KPMTSNGALLRILGIWVFSLAWTLLPFFG--WN
LWS_papXut  TPDMMHLIDPHWYQFPPMNPMWHGLLGFTIGVLGFISITGNGMVVYIFTSTKSLKTPSNLLVVNLAFSDFLMMLCMAPPMLINCY--YETWVFGPLACELYACAGSLFGSISIWTMTMIAFDRYNVIVKGIAA-KPMTINGALLRILGIWLFSLAWTIAPMLG--WN
LWSb_apiMe  PPDMLHLIDANWYQYPPLNPMWHGILGFVIGMLGFVSVMGNGMVVYIFLSTKSLRTPSNLFVINLAISDFLMMFCMSPPMVINCY--YETWVLGPLFCQIYAMLGSLFGCGSIWTMTMIAFDRYNVIVKGLSG-KPLSINGALIRIIAIWLFSLGWTIAPMFG--WN
LWS_homCoa  PPEMLYLVDAHWYQFPPMNPLWHSLLGFAMVVLGFIAVTGNGMVVYIFSCTKALRTPSNLLVVNLAFSDFLMMFTMAPPMVLNCY--YETWVLGPFMCELYAMFGSILGCTSIWTMVMIANDRYNVIVKGLSA-KPMTIKSALARILFCWAHSLIWCLAPFLG--WG
LWSa_nasVi  PPEIHHMIDPYWYQFPPMNPLWYGILGFVIGCLGCISVAGNGMVVYIFASTKSLRTPSNLLVINLAFSDFCMMFTMSPPMVINCY--YETWVFGPLMCEIYALCGSIFGCGSIWTMCMIAFDRYNVIVKGLSA-KPMTINGSLLRILGIWLMASIWTIAPMFG--WN
LWS_acyPis  PADMMHLIDPSWYQFPPMESMWYKWLGVTIFFLGILSVVGNGMVIYIFTCTKNLRTPSNLLIVNLAFSDFCLMFTMCPAMVWNCF--YETWMFGPFACELYAMFGSLFGVTSIWTMVFIALDRYNVIVKGLSA-KPMTTKLALLQIFCIYLHGLFWTLTPFFG--WS
LWSb_nasVi  PPEMLHLVGPHWYQFPPLWPIWHKLLGVVMIFIGVLGWCGNGMVVYIFLVTPSLRTPSNLLVINLAFSDFVMMIIMSPPMVVNCW--YETWILGPLMCDIYALIGSLCGGASIWTMTAIAYDRYNVIVKGMSG-TPLTIPRALVQIVLIWTHGLIWAMLPLFG--WN
LWSa_apiMe  PEEMLHLIDLYWYQFPPLDPLWHKILGLVMIILGIMGWCGNGVVVYVFIMTPSLRTPSNLLVVNLAFSDFIMMGFMCPPMVICCF--YETWVLGSLMCDIYAMVGSLCGCASIWTMTAIALDRYNVIVKGMSG-TPLTIKRAMLQILGIWLFGLIWTILPLVG--WN
LWS6_droMe  PAEIMHMVDPYWYQWPPLEPMWFGIIGFVIAILGTMSLAGNFIVMYIFTSSKGLRTPSNMFVVNLAFSDFMMMFTMFPPVVLNGF--YGTWIMGPFLCELYGMFGSLFGCVSIWSMTLIAYDRYCVIVKGMAR-KPLTATAAVLRLMVVWTICGAWALMPLFG--WN
LWS_meoOer  PENMLHMIHSHWYQFPPLNPMWYGILAFVVTVVGLCSICGNFVVIWVFMNTKALRSPANTLVVSLAVSDFIMMACMFPPLVLNCY--WGTWIFGPLFCEVYAFIGNTVGCASIGNMIFITFDRYNVIVKGISG-TPLSQKNTTLQVLFVWICSIMWCVFPFFG--WN
LWS1_droMe  TPDMAHLISPYWNQFPAMDPIWAKILTAYMIMIGMISWCGNGVVIYIFATTKSLRTPANLLVINLAISDFGIMITNTPMMGINLY--FETWVLGPMMCDIYAGLGSAFGCSSIWSMCMISLDRYQVIVKGMAG-RPMTIPLALGKIAYIWFMSSIWCLAPAFG--WS
LWS2_droMe  LPDMAHLVNPYWSRFAPMDPMMSKILGLFTLAIMIISCCGNGVVVYIFGGTKSLRTPANLLVLNLAFSDFCMMASQSPVMIINFY--YETWVLGPLWCDIYAGCGSLFGCVSIWSMCMIAFDRYNVIVKGING-TPMTIKTSIMKILFIWMMAVFWTVMPLIG--WS
LWS_limPol  PKEMLYMIHEHWYAFPPMNPLWYSILGVAMIILGIICVLGNGMVIYLMMTTKSLRTPTNLLVVNLAFSDFCMMAFMMPTMTSNCF--AETWILGPFMCEVYGMAGSLFGCASIWSMVMITLDRYNVIVRGMAA-APLTHKKATLLLLFVWIWSGGWTILPFFG--WS
LWS_ixoSca  PDEMLYMVHPHWYNFKPMNPLWHSLLGFAMVILGVISVVGNSMVIYIMTTSKSLRSPTNMLVVNLAFSDWCMMAFMMPTMAANCF--AETWILGPFMCEVYGMVGSLFGCGSIWSMVMITLDRYNVIVRGVAA-APLTHKRAALMIFFVWFWALTWTLLPFFG--WS
LWS2_plePa  PKEILHMIHDHWYQFPPLNPLWHSLLGIAMILLGIVSVIGNGMVMYLMNTTKSLKTPTNMLIVNLAFSDFCMMAFMMPTMAANCF--AETWILGPFMCEIYGMAGSLFGCVSIWSMVMIAFDRYNVIVRGMNA-EPLTTKKAAAQIFLIWAWAIMWTVLPFFG--WS
LWS2_hasAd  PKEILHMIHDHWYQFAPLNPLWHSLLGIAMIILGIVSVIGNGMVIYLMSTTKSLKTPTNMLIVNLAFSDFCMMAFMMPTMAANCF--AETWILGPLMCEIYGMAGSLFGCVSIWSMVMIAFDRYNVIVRGMSA-EPLTTKKAAAQIFFIWTWATTWTLFPFFG--WS
LWS1_plePa  PEDMLYMIHEHWYKYPPMESTMHYLLGITIILIGIISVSGNSIVIYLMLSVKSLRTPANFLVTSLAVSDGGMLAFMAPTMPINCF--AQTWVLGPFMCELYGMVGSLFGSASIWNMVMITLDRYNVIVRGMSG-KPLTKVGALLRIIFVWVWSLGWTIAPMYG--WS
LWS1_hasAd  PEDMLPMIHEHWYKFPPMETSMHYILGMLIIVIGIISVSGNGVVMYLMMTVKNLRTPGNFLVLNLALSDFGMLFFMMPTMSINCF--AETWVIGPFMCELYGMIGSLFGSASIWSLVMITLDRYNVIVKGMAG-KPLTKVGALLRMLFVWIWSLGWTIAPMYG--WS
BCRa_hemSa  PDRVKHMVLDHWYNYPPVNPMWHYLLGVVYLFLGVISIAGNGLVIYLYMKSQALKTPANMLIVNLALSDLIMLTTNFPPFCYNCF-SGGRWMFSGTYCEIYAALGAITGVCSIWTLCMISFDRYNIICNGFNG-PKLTQGKATFMCGLAWVISVGWSLPPFFG--WG
BCRb_hemSa  RPEIKPYVHQHWYNYPPVNPMWHYLLGVIYLFLGTVSIFGNGLVIYLFNKSAALRTPANILVVNLALSDLIMLTTNVPFFTYNCF-SGGVWMFSPQYCEIYACLGAITGVCSIWLLCMISFDRYNIICNGFNG-PKLTTGKAVVFALISWVIAIGCALPPFFG--WG
BCR_porPel  RPEIKPYVHQHWYNYPPVNPMWHYLLGVIYLCLGFISIIGNGMVIYLFAKCQALRTPANILVVNLALSDLIMLTTNVPFFTYNCF-NGGVWMFSATYCEIYGCLGAITGVTSTWLLCMISFDRYNIICNGFNG-PKLTNGKAIILAFISWAISVGFGIAPLFG--WG
BCR_triGra  PSDMKTMVHSHWNKFPPVNPMWHYLLGMVYIILGTVSIAGNSLVISLFTKTKELRTPANMFVVNLAFSDLCMMITQFPMFVYNCF-NGGMWLFGPFLCELYAATGAVFGLCSICTLACIAFDRYNLIVKGMSG-PKMTSKRATILIAFCWAYAIGWSLPPFFG--WG
BCR2_triLo  PSDMKTMVHSHWSKFPPVNPMWHYLLGLVYIVLGTVSIAGNSLVISLFTKTKELRTPANMFVVNLAFSDLCMMITQFPMFVYNCF-NGGMWLFGPFLCELYAATGAVFGLCSICTLACIAYDRYNLIVKGMSG-PKMTSKRATILIAFCWSYAIGWSLPPFFG--WG
BCRa_dapPu  PDDMKEFIHPHWNKFPPVNPMWHYLLGVIYVILGITSVTGNSLVVHLFAKTRDLRTPANMFVINLAFSDLCMMITQFPMFVFNCF-NGGVWLFGPLFCELYACTGSIFGLCSICTMAAISYDRYNVIVNGMNR-RRMTYGRAGGLILFCWIYAIGWSIPPFVG--WG
BCR_limPol  PENIKHLISDHWSKFPAVNPMWHYLLGLIYIVLGIASLTGQSVVLYLFAKTKPLRTPANMLIVNLAFSDFMMMITQFPVFIINCL-GGGAWQLGPLLCEITGFAGGLFGYGSIVTLAVISIDRYNVIVRGFSA-SPLTHARSAVFILVIWAWTLGWALPPFFG--WG
BCR2_braKu  PADVIAMTHAHWKQFPPSNPAWNYLFGVIYFFLWIVNHIGNGLVIWIFLKTKSLRTPSNMLIVNLAIADFFMMLTQSPLYIISAF-TSRWWIWGHFWCRFYGYTGGITGIAAIFTMVFIGYDRYNVIVKGMNG-TKITKGMAFIMILWTWIYANAFCLPAMLEV-WG
BCR3_braKu  PADIVALTHAHWKKFPPSNPAWNYLFACLYFFLWVINHIGNGLVIKIFLKTKSLRTPSNMLIVNLAIADFFMMLTQSPLFIISAF-SSRWWIWGHFWCRFYGYTGGITGIAAIFTLVFIGYDRYNVIVKGMSG-KRISKGMAFGMIVWTWVYANVFCLPPMLQV-WG
BCR1_triGr  PEDVRAFLHPHWHNFPATHPAIYYLFGLVYLVLGVTSVGGNYLVLRIFTKFQELRRPSNVLVINLALSDMLLMLTLFPECVYN-FLSGGPWRFGDLGCQIHAFCGALFGYNQITTLVFISYDRFNVIVRGMGG-TPLTYARVSAMVAFSWLWATGWSVAPLVG--WG
BCR2_triGr  PLDMHHLLHSHWDAYPPADPRIHYLLGMLYFFLGIAACMGNVLVLHIFGKHKNLRSPTNTLLMNLAFCDLMIFIGLYPEMLGNIFMNDGTWMWGDVACRIHAWFGLVFGFGQMQTLMYMSIDRYNVIVKGLSA-QPLTYKKVTQWLAQVWIVSLFWGTAPFFG--FG
BCR1_triLo  PLDMHHLLHSHWDSYPPADPRIHYLLGMLYFFLGIAACVGNVLVLHIFGKHKNLRSPTNTLLMNLAFCDLMIFIGLYPEMLGNIFMNDGTWMWGDIACRLHAWFGLVFGFGQMQTLMYMSIDRYNVIVKGLSA-QPLTYKKVTQWLAQVWIVSLFWGTAPFFG--FG
BCR3_triGr  PENVRYMVHLHWEKFPPPDPRVHTALGALYLIMGVMSAVGNVLVLYIFGKYKSLRSPTNVLVMNLAFCDLGLFVGLYPELLGNIFINNGPWMWGDVACKIHAWCGLAFGFGQMQTLMFVSMDRYYVIVKGLKA-PPLTYWKVSVWLAMVWIVSIFWATSPFFG--FG
 Consensus  p......!..hW..%ppp.p..hy.lg..y......s..GNg.Vi.if...ksLRtpsN.l!.NLA..Df.$m....P....N.. .......g...C.i%a..G.l.G..si.t...Ia.DR%nv!v.p......lt...a...i...W.....w...P.....w.
           
Special res .......................................................<---------HEK region-------->...................................................K............N.PRYR..........
UV7 diagnos .......................................................<----16aa HEK region del---->.............................................................................p.l
UV  diagnos .............................................................................................................................................D.........RF...........
Location    E2E2E2E2E2E2E2E2E2E2E2E2EM5M5M5M5M5M5M5M5M5M5M5M5M5M5M5C3C3C3C3C3C3C3C3C3C3C3C3C3C3C3C3C3CM6M6M6M6M6M6M6M6M6M6M6E3E3E3E3E3E3EM7M7M7M7M7M7M7M7M7M7M7CTCTCTCTCTCTCTCTC
                                                                                                                                                              
UV7_aedAeg  RYTPEGFLTACSFDYLERT-RDARLFMFLYFIFAWVVPIIAITFCYIQILRVVIGAN---------SIQSSKNKSKT-------EVKLAGVVIGIIGLWFIAWTPYAIVAMMGVFGYESL--LSPLGSMVPAILAKTAACIDPYFYAMNHPRYRQELRKMFGLN
UV7_culQui  RYTPEGFLTACSFDYLDRG-WDARVFMFMYFVFAWVIPFLTISYCYVAILRVVVGAG---------SIQSSKNKNKQ-------EVKLAGVVIGIIGLWFIAWTPYAVVAMLGVFGYEHL--LTPLGSMIPAILAKTASCIDPYFYAMNHPRFRQELRKMFGKE
UV7_anoGam  RYTAEGYLTACSFDYLDRT-YKARVFMFVYFVFAWLIPFAIISYCYARILIAVINAN---------AIQSSKSKNKT-------EVKLAGVVVGIIGLWFAAWTPYAVVAMMGVFGYEQY--LTPLNSMIPAVFAKIAASIDPYFYAMNHPRYRQMLERMFCNR
UV7_droMel  VYVPEGFLTTCSFDYLNKE-MPARIFMALFFVAAYCIPLTSIVYSYFYILKVVFTAS---------RIQSNKDKAKT-------EQKLAFIVAAIIGLWFLAWSPYAIVAMMGVFGLERH--ITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGR
UV7_droYak  VYVPEGFLTTCSFDYLNKE-TPARIFMALFFVAAYCIPLTSIVYSYFYILKVVFTAS---------RIQSNKDKAKT-------EQKLAFIVAAIIGLWFLAWSPYAIVAMMGVFGLERH--ITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGR
UV7_droAna  VYVPEGFLTTCSFDYLNKE-TPARIFMALFFVAAYCFPLTAIVYSYFYILKVVFSAG---------RIQSNKDKAKT-------EQKLAFIVAAIIGLWFLAWSPYAVVAMMGVFGLEKH--ITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFYGR
UV7_droPse  VYVPEGFLTTCSFDYLNKE-TPARIFMALFFVAAYCVPLTTIVYSYFYILKVVFTAS---------RIQSNKDKAKT-------EQKLAFIVAAIIGLWFLAWSPYAIVAMMGVFGQERH--ITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLFFGR
UV7_droWil  VYVPEGYLTTCSFDYLNKE-TPARIFMALFFVAAYCVPLTCIMFSYFYILKVVFTAN---------RIQSNKDKAKT-------EQKLTFIVAAIIGLWFLAWSPYAVVAMMGVFGLEQH--ITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLIYGR
UV7_droMoj  VYVPEGFLTTCSFDYLNKE-TPARIFMALFFVAAYCIPLASIVYSYFYILKVVFTAN---------RIQSSKDKAKT-------EQKLTFIVAAIIGLWFIAWSPYAIVAMMGVFGQEQH--ITPLGSMIPALFCKTAACVDPYLYAATHPRFRVEVRMLMYGR
UV7a_acyPi  RYVSEGYLTSCSFDYLSDN-DQDKRFILVFFTAAWCIPFTIILYCYVNILMAVWMTT----EIVTSRVGQQEEKRKT-------DIRLGYMVIGALALWFVSWTPYAVVALLGVFDLKEY--ISPLSSMIPALFCKAASCTDPWFYAITHPRFKKELMKLLTKS
UV7b_acyPi  RYVPEGYLTSCSFDYLSDD-NQEKGFILVFFTAAWCIPFTTISYCYIKILRAVWMTS----EMAASRFGQEEEKRKT-------EIRLGYVVVGVIMLWFVSWTPYAMVALLGVFDRKDY--ITPLSSMIPAVLCKAASCMDPWIYAITHPRFKNELTKLMSRK
UV7_rhoPro  TYVPEGFLTSCSFDYLSTD-IQNRCFIFIYFVAAWCLPLLVIITSYVGICREVLRVS----LI---RKGQEREQRKR-------EAKLSAILALATFLWFLSWTPYAAVALLGIFGYKNH--ITQLASMIPALFCKTAACVNPFIYGLNHPRLRQQLLKLCCKK
UV7_pedHum  KYVPEGYLTSCSFDYLTQD-TASRLTIFVFFVAAWIVPLSIILGSYMALYKVVLKARGTHFNTVMTRHCKDIEIQRP-------ELKAAVTVICIVCLWTLSWTPYAVVALLGITGNEKY--ISPMSSMIPALFCKTASCIDPFVYAATNRRFRNELKRKYRKR
UV7_ixoSca  PYVPEGFLTSCSFHFLSDA-TSDRCFVWIFFVAAWCVPLVFVTTCYSGILVTVIRS----------RKALAQESRRS-------ELRVAKVSLALVLLWTVAWTPYAIVALLGITGRRNL--LTPWGSMAPAMFCKSAAVLDPFVYGLSHPSFRRELAIMLPCL
UV5_anoGam  RYVPEGFLTSCTFDYLSGT-FDTRLFVASIFTFSYVLPMSLIIYYYSQIVSHVVNHEKSLREQAKKMNVESLRSNQNQK-DASVEIRIAKAAITVCFLFVASWTPYAVLALIGAFGDKSL--LTPGVTMFPACACKFVACLDPYVYAISHPRYRIELQKRLPWL
UV5_nasVit  RFVPEGFLTSCTFDYITDS-DEIRYFVGTIFTFSYAIPMTLIIYFYSQIVGHVVNHEKALREQAKKMNVESLRSGQNKD-QASAEVRIAKVALTICFLFVAAWTPYGVMSLIGAFGNKSL--LTPGVTMIPACCCKAVACLDPYVYAISHPRYRLELQKRMPWL
UV5_apiMel  RFVPEGFLTSCSFDYLTDT-NEIRIFVATIFTFSYCIPMILIIYYYSQIVSHVVNHEKALREQAKKMNVDSLRSNANTS-SQSAEIRIAKAAITICFLYVLSWTPYGVMSMIGAFGNKAL--LTPGVTMIPACTCKAVACLDPYVYAISHPKYRLELQKRLPWL
UV5_diaNig  RYAPEGYLTSCSFDYLTDT-PENHMFVLCIFICSYVIPMSLIIYFYSQIVSHVVNHEKALKEQAKKMNVDSLRSNQQQN-QTSAEIRIAKVAIGICFLFVASWTPYAVLALIGAFGNKAL--LTPGVTMIPACTCKAVACLDPYVYAISHPRYRAELQKRLPWL
UV5_lucCru  RFVPEGFLTSCTFDYLTDT-FDNDMFVAVIFICSYVIPMSMIIYFYSQIVKHVMHHEKALRDQAKKMNVESLRSNQSLQ-SQSIEIKIAKVAIMVCFLFVASWTPYAVLALIGGFGDQSL--LTPGVTMVPALACKFVACLDPYVYALSHPRYRMELQKRLPWL
UV5_triCas  RFAPEGFLTACSFDYLTDT-FDNHMFVTSIFICSYVIPMSMIIYFYSQIVSKVFSHEKALREQAKKMNVESLRSNQSQQASQSAELRIAKAAIAICSLFVASWTPYAVLALIGAFGDQSL--LTPGVTMVPACACKFVACLDPYVYAISHPKYRLELQKRLPWL
UV5_rhoPro  RFVPEGFLTSCTFDYLTPT-SEIRNFVTVMFFICYVFPMSLIIYFYSQIVSHVIIHEHNLREQAKKMNVESLRSNANMH-TQSAEIRIAKAAITICFLFVASWTPYAVLALIGAYGNQDL--LTPAVTMIPACACKAVACVDPYVYAISHPRYRQELSKKFPWL
UV4_droMel  RFVPEGYLTSCSFDYLSDN-FDTRLFVGTIFFFSFVCPTLMILYYYSQIVGHVFSHEKALREQAKKMNVESLRSNVDKS-KETAEIRIAKAAITICFLFFVSWTPYGVMSLIGAFGDKSL--LTPGATMIPACTCKLVACIDPFVYAISHPRYRLELQKRCPWL
UV3_droMel  RFVPEGYLTSCTFDYLTDN-FDTRLFVACIFFFSFVCPTTMITYYYSQIVGHVFSHEKALRDQAKKMNVESLRSNVDKN-KETAEIRIAKAAITICFLFFCSWTPYGVMSLIGAFGDKTL--LTPGATMIPACACKMVACIDPFVYAISHPRYRMELQKRCPWL
UV5_manSex  RYVPEGYLTSCSFDYLTNT-FDTKLFVACIFTCSYVFPMSLIIYFYSGIVKQVFAHEAALREQAKKMNVESLRANQGGS-SESAEIRIAKAALTVCFLFVASWTPYGVMALIGAFGNQQL--LTPGVTMIPAVACKAVACISPWVYAIRHPMYRQELQRRMPWL
UV5_papXut  RYVPEGFLTSCTFDYLTTT-FDNKLFVASMFVCVYIFPMIAILYFYSGIVKQVFAHEAALREQAKKMNVDSLRSNQNAA-AESAEIRIAKAALTVCFLYVASWTPYGVMSLIGAFGDQNL--LTPGVTMIPALACKGVACIDPWVYAISHPKYRQELQKRMPWL
UV5_acyPis  RYVPEGYLTSCTFDYLSPT-DETRAFVGIMFVICYVIPVSLVIFFYSQIVSHVFNHEKALREQAKKMNVESLRSNQDAN-AQSAEVRIAKAAITICCLFIASWTPYAVVAMIGAFGDRSL--LTPGITMIPAIFCKTVACFDPYVYAISHPRYRLELSKRVPCL
UVB_acyPis  RFVPEGFLTSCSFDYLTPT-EETKAFVGTMFVICYVIPMSFIIYFYSQIVCHVFNHEKALREQAKKMNVESLRSNQDAN-AQSAEVRIAKAAITICFLFVAAWTPYAVVAMIGAFGDQSL--LTPIASMLPAVFAKTVACFDPYVYAISHPKYRLELSKRVPCL
UVB_megVic  RFVPEGFLTSCSFDYLTPT-EETKAFVGTMFVICYVIPMSFIIYFYSQIVCHVFNHEKALREQAKKMNVESLRSNQDAN-AQSAEVRIAKAAITICFLFVAAWTPYAVVAMIGAFGDQSL--LTPIASMLPAVFAKTVACFDPYVYAISHPKYRLELSKRVPCL
UV5_dapPul  RYVPEGFLTTCTFDYMTED-ASTRFFVGSIFVYSYVIPLAMLIFYYSKIVRSVGDHEKTLRDQAKKMNVTSLRSNRDQN-EKSAEVRIAKVAIALATLFVFAWTPYAFVALTAAFGNRSV--LTPLLSTVPACCCKLVSCINPWIYAINHPRYRMELQKKMPWF
UV5_braKug  RFVPEGFLTTCSFDYITED-SSTRAFVGTIFFTSYVLPMILIIYFYSQIVGHVRQHEETLRAQAKKMNVATLRSGKDDQ-EQSAEVRIAKVCIGLFSMFVISWTPYAAVALLCAFGNRAA--VTPLVSMIPALTCKAVACIDPWIYAINHPRYRLELQKRLPWF
UV5_triLon  RFVPEGFLTTCSFDYMTET-SSIRWFVGCIFTYSYIIPLGLIIYYYSKIVGHVQEHERILREQARKMNVESLRSGKDQQ-EKSAEIRIAKVAIGLSLMFVVAWTPYALVALIAAFGNRAV--LTPLVSMIPACCCKAVACIDPWIYAINHPRYRLELQKRMPWF
UV5_triGra  RFVPEGFLTTCSFDYMTET-SSIRWFVGCVFTYSYIIPLGLIVYYYSKIVGHVQEHERILREQARKMNVESLRSGRDHQ-EKSAEIRIAKVAIGLSLMFVVAWTPYALVALIAAFGNRAV--LTPLVSMIPACCCKAVACIDPWIYAINHPRYRLELQKRMPWF
UV5_pedHum  KFAPEGYLTTCTVDYLTDT-SQTRMFIVTIFFAAYVLPLSLIIYFYTKIVLHVINHEKSLKAQAKKMNVESLRSDGNKN--YAVEIRITKVAIAMCFLFVISWTPYAVVALIGCFGNKHL--ITPLVSMIPACACKAVACIDPYIYAISHPRFRVEVNKRFACL
UVB_anoGam  RYIPEGYLTTCSFDYLTDD-PDTRVFVGCIFTWAYVIPMIFICYFYARLFGHVRQHEMMLKNQARKMNVESLTANRSEK-AQAVEMRIAKAAFTIFFLFVCAWTPYAIVTMIGAFGDRTM--LTPFVTMVPAVCCKIVSCLDPWVYAISHPKYRQELERRLPWM
UVB_manSex  RFVPEGFLTTCSFDYFTED-QDTEVFVACIFVWSYCIPMALICYFYSQLFGAVRLHERMLQEQAKKMNVKSLASNKEDN-SRSVEIRIAKVAFTIFFLFICAWTPYAFVTMTGAFGDRTL--LTPIATMIPAVCCKVVSCIDPWVYAINHPRYRAELQKRLPWM
UVB_nasVit  RYTTEGFLTTCSFDFLSDD-QDTKVFVAAIFSWSYCFPMVLIIYFYSQLIKSVRRHEKMLREQAKKMNVKSL-SAQ-DK-ERSVEMRIAKVAFTIFFLFVCSWTPYAVVTMIAAFGNREL--VTPFSSMLPAVFAKTVSCIDPWVYAINHPRYRQELTKRCQWM
UVB_apiMel  RYTTEGFLTTCSFDFLTDD-EDTKVFVTCIFIWAYVIPLIFIILFYSRLLSSIRNHEKMLREQAKKMNVKSLVSNQ-DK-ERSAEVRIAKVAFTIFFLFLLAWTPYATVALIGVYGNREL--LTPVSTMLPAVFAKTVSCIDPWIYAINHPRYRQELQKRCKWM
UVB_diaNig  RYVPEGLLTTCSFDYLTDD-EDTKVFTASIFTWSYAFPLCLIVFFYCKLFKQVRLHEKMLQEQARKMNVKSLQTNQDVA-QKSVEIRIAKVAFTIFFLFLCSWTPYATVAMIGAFGNRAL--LTPMSTMIPALFSKIVSCIDPWIYAINHPRFRGELLKRAPWF
UV5B_droMe  RYQPEGFLTTCSFDYLTNT-DENRLFVRTIFVWSYVIPMTMILVSYYKLFTHVRVHEKMLAEQAKKMNVKSLSANANAD-NMSVELRIAKAALIIYMLFILAWTPYSVVALIGCFGEQQL--ITPFVSMLPCLACKSVSCLDPWVYATSHPKYRLELERRLPWL
UV5_plePay  RYVPEGFLTSCSADFFMQD-FNGRSYIVGTWFFGWFIPVAAIVFFYVQIFLAVKDHEEKIKEQARKMNVDSIRSNEAVK-NSSAEVRIAKTAMCVFLMFLSSWAPYILVAFITGFSDPKLKRITPVISMVPAMTIKASACFDPFFYALSHPRYRLELQNRMPWL
UV5_hasAda  RFVPEGFLTSCSSDFYMQD-FNGRSYIVGTWFFGWFIPVAAIIFFYAQIFLAVKDHEEKIKEQARKMNVDSFRSNEALK-NSSAEVRIAKTAMCVVLLFLTSWVPYILVAFIAGFSDPKLKRVTPVISMIPAMTIKGSACFDPFFYALSHPRYRLELQNKLPWL
LWS_anoGam  RYVPEGNMTACGTDYLTDD-WFHKSYILVYSVFVYYTPLFTIIYAYFFIIKAVSAHEKNMREQAKRMNVQSLRSSDDGK---STEMKLAKVALVTISLWFMAWTPYTVINYTGVF--KTAS-ITPLATIWGSVFAKANAVYNPIVYGISHPKYRAALLRRFPSL
LWS_rhoPro  RYVPEGNMTACGTDYLTKN-WVSRSYILVYSVFVYFLPLFTIIYSYFFILQAVSAHEKQMREQAKKMNVASLRSAEAANT--SAEAKLAKVALMTISLWFMAWTPYLVINYSGIF--ETIS-ISPLFTIWGSLFAKANAVYNPIVYAIRHPKYKQALEKKFPSL
LWS_schGre  RYVPEGNMTACGTDYLTKD-WVSRSYILVYSFFVYLLPLGTIIYSYFFILQAVSAHEKQMREQRKKMNVASLRSAEASQT--SAECKLAKVALMTISLWFFGWTPYLIINFTGIF--ETMK-ISPLLTIWGSLFAKANAVFNPIVYGISHPKYRAALEKKFPSL
LWS_lucCru  RYVPEGNMTACGTDYLSTG-WFSRSYILFYSWFVYFIPLFAIIYSYWFIVQAVSAHEKAMREQAKKMNVASLRSSEAAQT--SAECKLAKVALMTISLWFLAWTPYLVTNYAGIF--DGSK-ISPLATIWSSLFAKANAVYNPIVYGISHPKYRQALQKKFPSL
LWS_triCas  RYVPEGNMTACGTDYLTKD-WVSRSYILVYAVWVYFVPLFTIIYSYWFIVQAVAAHEKSMREQAKKMNVASLRSSEAAQT--SAECKLAKIALMTITLWFFAWTPYLVTNFTGIF--EGAK-ISPLATIWCSLFAKANAVYNPIVYGISHPKYRQALQKKFPSL
LWS_manSex  RYVPEGNMTACGTDYLSKS-WVSRSYILIYSVFVYFLPLLLIIYSYFFIVQAVAAHEKAMREQAKKMNVASLRSSEAANT--SAECKLAKVALMTISLWFMAWTPYLVINYTGVF--ESAP-ISPLATIWGSLFAKANAVYNPIVYGISHPKYQAALYAKFPSL
LWS_papXut  RYVPEGNMTACGTDYLSKS-WLSRSYILVYSIFVYYTPLLLIIYSYFFIVQAVAAHEKAMREQAKKMNVASLRSSEAANT--SAECKLAKVALMTISLWFMAWTPYLVINYTGVF--ETAP-ISPLATIWGSVFAKANAVYNPIVYGISHPKYRAALYQKFPSL
LWSb_apiMe  RYVPEGNMTACGTDYFNRG-LLSASYLVCYGIWVYFVPLFLIIYSYWFIIQAVAAHEKNMREQAKKMNVASLRSSENQNT--SAECKLAKVALMTISLWFMAWTPYLVINFSGIF--NLVK-ISPLFTIWGSLFAKANAVYNPIVYGISHPKYRAALFAKFPSL
LWS_homCoa  RYVPEGNMTACGTDYLTPD-WISKSYILVYSLFCYFMPLFLIIYSYWFIVQAVSAHEKAMREQAKKMNVASLRSSDAANT--SAEHKLAKVALMTISLWFCAWTPYLVINYAGIF--QALT-ISPLFTIWGSVFAKANACYNPIVYAISHPKYRAALNKKFPSL
LWSa_nasVi  RYVPEGNLTACGTDYFSKD-WVSRSYIVVYSFFVYFLPLFMIIYSYYFIIKAVSAHEKNMREQAKKMNVASLRQGDSQ----SAENKLAKIALMTISLWFMAWTPYLVINWAGIF--DLAR-LTPLFTIWGSVFAKANAVYNPIVYGISHPKYRAALFARFPSL
LWS_acyPis  RYVPEANMTACGTDYLTLA-WHSRSYVLVYAIFAYYLPLLVIIYAYYFIVKAVASHEKSMREQAKKMNVSSLRSGDQSNT--SAEFKLAKVALMTISLWFMAWTPYMVINFAGIF--QLMT-IDPLFTIWGSVFAKANAVYNPIVYAISHPKYRLALDKKFPCL
LWSb_nasVi  RYVPEGNMTSCGTDYVSDD-WLGKSYILVYSIFVYYTPLFSIILCYWHIVSAVAAHERGMREQAKKMNVASLRSGDQSGE--SAEVKLAKVAVTTISLWFLAWTPYLVTNYMGIF--AKQH-VSPLFTIWASLFAKTNACYNPIVYGISHPKYRAGLKVKCPCL
LWSa_apiMe  RYVPEGNMTACGTDYLSQD-WTFKSYILVYSFFVYYTPLFTIIYSYYFIVSAVAAHEKAMKEQAKKMNVTSLRSGDNQNT--SAEAKLAKVALTTISLWFMAWTPYLVINYIGIF--NRSL-ITPLFTIWGSLFAKANAIYNPIVYGISHPKYRAALKEKLPFL
LWS6_droMe  RYVPEGNMTACGTDYFAKD-WWNRSYIIVYSLWVYLTPLLTIIFSYWHIMKAVAAHEKAMREQAKKMNVASLRNSEADKSK-AIEIKLAKVALTTISLWFFAWTPYTIINYAGIF--ESMH-LSPLSTICGSVFAKANAVCNPIVYGLSHPKYKQVLREKMPCL
LWS_meoOer  RYVPRGDMTACGTDYLTED-EFSRSYLYVYSVWVYIGPLALIIYCYFHIVSAVATHEKQMRDQAKKMGVKSLRTEEAKKT--SAECRLAKVALTTVSLWFMAWTPYLIINWAGMF--YPSV-VSPLFSIWGSVFAKANAVYNPIVYAISHPKYRAALYKKLPCL
LWS1_droMe  RYVPEGNLTSCGIDYLERD-WNPRSYLIFYSIFVYYIPLFLICYSYWFIIAAVSAHEKAMREQAKKMNVKSLRSSEDAEK--SAEGKLAKVALVTITLWFMAWTPYLVINCMGLF--KFEG-LTPLNTIWGACFAKSAACYNPIVYGISHPKYRLALKEKCPCC
LWS2_droMe  AYVPEGNLTACSIDYMTRM-WNPRSYLITYSLFVYYTPLFLICYSYWFIIAAVAAHEKAMREQAKKMNVKSLRSSEDCDK--SAEGKLAKVALTTISLWFMAWTPYLVICYFGLF--KIDG-LTPLTTIWGATFAKTSAVYNPIVYGISHPKYRIVLKEKCPMC
LWS_limPol  RYVPEGNLTSCTVDYLTKD-WSSASYVVIYGLAVYFLPLITMIYCYFFIVHAVAEHEKQLREQAKKMNVASLRANADQQKQ-SAECRLAKVAMMTVGLWFMAWTPYLIISWAGVFS-SGTR-LTPLATIWGSVFAKANSCYNPIVYGISHPRYKAALYQRFPSL
LWS_ixoSca  RYVPEGNMTSCTIDYLTKA-LWSASYVVAYAGGVYWTPLFINIYCYSKIVRAVAQHEKQLRLQARKMNVASLRANAEQTKT-SAEARLAKIALMTVGLWFMAWTPYLTIAWAGIFS-DGSK-LTPLATIWGSVFAKANACYNPIVYGISHPKYRAALARRFPSL
LWS2_plePa  RYVPEGNMTSCTVDYLSED-LKSSSYVLIYGCAVYFIPLFTLIYNYTFIVRAVSIHEDNLREQAKKMNVTSLRANADQQKQ-SAECRLAKIALMTVGLWFIAWTPYLCIAWSGIFS-SRKH-LTPLATIWGAVFAKAVAVYNPIVYGISHPKYRAALFQKFPSL
LWS2_hasAd  RYVPEGNMTSCTVDYLTED-LKSSSYVLIYGCAVYFTPLFTLIYNYTFIVRSVSIHENNLREQAKKMNVSSLRANADQQKQ-SAECRLAKIALMTVGLWFIAWTPYLSIAWSGIFS-SRKH-LTPLATIWGAVFAKAVAVYNPIVYGISHPKYRAALFEKFPSL
LWS1_plePa  SYAPEGSMTGCTVDYLHTD-ISTMSYLIVYAIFVYFVPLFIIIYCYTYIVMQVAAHEKSLREQAKKMNIKSLRSNEDNKKA-SAEFRLAKVALMTICLWFMAWTPYLILSLLGIFS-DREW-LTPLTSIWGAVFAKAASAYNPIVYGISHPKYRAALHEKFPCL
LWS1_hasAd  RYVPEGSMTSCTIDYIDTA-INPMSYLIAYAIFVYFVPLFIIIYCYAFIVMQVAAHEKSLREQAKKMNIKSLRSNEDNKKA-SAEFRLAKVAFMTICCWFMAWTPYLTLSFLGIFS-DRTW-LTPMTSVWGAIFAKASACYNPIVYGISHPKYRAALHDKFPCL
BCRa_hemSa  SYTLEGILDSCSYDYFTRD-MNTITYNICIFIFDFFLPASVIVFSYVFIVKAIFAHEAAMRAQAKKMNVTNLRSN-EAETQ-RAEIRIAKTALVNVSLWFICWTPYAAITIQGLL-GNAEG-ITPLLTTLPALLAKSCSCYNPFVYAISHPKFRLAITQHLPWF
BCRb_hemSa  NYILEGILDSCSYDYLTQD-FNTFSYNIFIFVFDYFLPAAIIVFSYVFIVKAIFAHEAAMRAQAKKMNVSTLRSN-EADAQ-RAEIRIAKTALVNVSLWFICWTPYALISLKGVM-GDTSG-ITPLVSTLPALLAKSCSCYNPFVYAISHPKYRLAITQHLPWF
BCR_porPel  KYILEGILTSCSYDYLTQD-FNTRSYNIIIFVFDYFLPAAIIIFSYVFIVKAIFAHEAAMRAQAKKMNVTNLRSG-EAESQ-RAEIRIARTALVNVSLWFICWTPYALISLQGVL-GDLSG-INLLVTTLPALLARSCSWYNPFVYAISHPKYRLAITQHLPWF
BCR_triGra  RYIPEGILDSCSFDYLTRD-SSTKSFGLCLFFFDYVTPLSIIVFAYFHIVRAIFEHEKILREQAKKMNVTSLRSNADQNAQ-SAEIRIAKVALINISLWVAMWTPYATIVLQGLL-GNQEN-ITPLVSILPALIAKSASIYNPVIYAISHPRYRVALQQKLPWF
BCR2_triLo  RYIPEGILDSCSFDYLTRD-SSTKSFGLCLFFFDYITPLSIIVFAYFHIVRAIFEHEKILREQAKKMNVTSLRSNADQNAQ-SAEIRIAKVALINISLWVAMWTPYATIVLQGLL-GNQEN-ITPLVSILPALIAKSASIYNPVIYAISHPRYRIALQQKLPWF
BCRa_dapPu  KYIPEGILDSCSFDYLTRD-TMTISFTCCLFAFDYCVPLIIIIFCYYHIVRAIVHHEDALRDQAKKMNVSSLRSNADQKSQ-SAEIRVAKIAMMNITLWVAAWTPYAAICLQGAV-GNQDK-ITPLVTILPALIAKSASIFNPVVYAISHPKYRLALQKALPWF
BCR_limPol  RYVPEGILNSCSFDYLTRD-WATVSYIMGCWICEYALPLMVIIYCYIFIVKAVCDHERHLREQAKKMNVASLRSNVDTQKA-SAEMRIAKVALVNVLLWVVSWTPYAAIAMIGIA-GDQML-ITPLRSALPALAGKAASVYNPIVYAISHPKFRLAMQKEIPCC
BCR2_braKu  NFSPEGLLSTCSFDYLNDNKFHGYFYTMYIFTGAYCVPMLLLMFFYSQIVKAVWAHEASSRAQAKKMNVESLRSNADANAE-SAEMRIAKVALTNVLLWVCIWTPYAFVAVTGAF-GNRQI-LTPLVAQLPSLICKMASCLNPLVYAISHPKYRQVLQKELPWF
BCR3_braKu  DFSPEGMLSTCSFDYLNENRLHGPIFTGYIFFGAYCVPMFLLFFFYSQIVKAVWAHEAALKAQAKKMNVESLRSNADANAE-SAEVRIAKVALTNVLLWICIWTPYAFVAVTGAF-GNRQI-LTPLVAQLPSLICKCASSLNPIVYAISHPKFRQVIQKDYPWF
BCR1_triGr  GYALDGMLGTCSFDYVTRT-WNNRSHILAATAFMWVIPVLIIAGCYWFIVQAVFKHEAELKAQAKKMNVASLRSNADQQQV-SAEIRIAKVAITNVVLWLSAWTPFMVISNLGIWADPQQV--TPLVSSLPVLLSKTSCSYNPLVYAISHPKYRECLKTLVPWI
BCR2_triGr  NFALDGILNTCSFDYFSRD-MLSMSYIVSACVWAYVIPLIVIIFCYTFIVRAVFEHEETLRQQAAKMNVTSLRSSANSEDT-SAEFRIAKIAMINVCLWLWAWSPFTIVSFIGIF-GNQAI-ITPYLSSLPVILAKTSSVYNPIVYALSHPRYQAALKEEFAWL
BCR1_triLo  NFALDGILNTCSFDYFTRD-MPAMSYIVGACVSAYVIPLIVIIVCYTFIVRAVFEHEETLRQQAAKMNVTSLRSSASAEDT-SAEFRIAKIAMINVCLWLWAWSPFTIVSFIGIF-GNQAI-ITPYLSSLPVILAKTSSVYNPIVYALSHPKYQAALKEEFAWL
BCR3_triGr  NLSVDGLLNTCSYDYYTRD-LPTVAYIVGSCVHAYVLPLAVIIFCYSYIVQAVFHHERQLREQAAKMNVASLRSSGGKQDEMSAEFRIAKIALINCCLWLWAWTPFTVISFMGVLHDDQSI-INPYVSSLPVLLAKTSAVYNPIVYGLSHPKFQQCLREEFGWN
 Consensus  r%vpEG.$t.CsfDYlt.. ...r.%....f...y..Pl..!iy.Y..iv.aV..he..lreqakkmnv.slrs..........E.riakva.....Lw..aWtPYav.a..G.f...... .tPl.sm.pa.f.K..ac.#P.vYaisHP.%r.el....p.l