Sequence similarity methods, April 2000

This is designed to provide an automatically updated BLAST protein similarity comparison of species genes, using current reference genes and sequence data sets. See the perl script for the gene and sequence data extraction and BLAST methods, and blastout.pl for result extraction.

For these organisms: man, mouse, the reference protein is taken as the protein in the current NCBI RefSeq data set. For these organisms: fly, weed, worm, yeast, the reference protein is taken as the CDS translation in the current NCBI genbank/genomes/ data set. Where a gene has multiple CDS features, the first CDS feature is used.

Current gene symbols are matched to those in the sequence feature data using the symbols and synonyms of the reference organism databases. MEOW IDs are assigned on the basis of this symbol/synonym match, except in the case of fly sequences, where the FlyBase ID is available.

On extracting protein sequences for each organism, a fasta set is created with the MEOW ID, gene name, and sequence accession in each header. Each such fasta set is available in the MEOW section for each organism. These sets are concatenated into a single blast-able fasta set, now some 50,000 sequences, and an all-way blastp comparison is done (NCBI BLASTP 2.0, blastall) with an expectation-value cutoff of 1E-30.

Results are extracted for each gene/protein comparison, with an attempt to list at least one similarity for each species, and to list more when lower scores fall within 4% of the best score. These results are listed as a table on each species page, with MEOW IDs, gene symbol, and BLAST statistics (the 'hgtable' file), and are added to the gene report data.
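The result-extraction rule described above (keep the best similarity per species, plus any others scoring within 4% of it) can be illustrated with a minimal Python sketch. This is not the blastout.pl code. The function name, the tuple layout, and the example IDs are hypothetical, and it assumes that the "best score" is taken per species and that the 1E-30 cutoff is applied to the BLAST expectation value.

```python
from collections import defaultdict

EVALUE_CUTOFF = 1e-30  # cutoff stated in the text
SCORE_WINDOW = 0.04    # "within 4% of the best score"


def select_hits(hits, evalue_cutoff=EVALUE_CUTOFF, window=SCORE_WINDOW):
    """Pick the hits to report for one query protein.

    hits: iterable of (species, subject_id, score, evalue) tuples.
    Returns {species: [(subject_id, score, evalue), ...]}, best hit first.
    Assumption: the 4% window is measured against the best score
    within each species, not against the overall best.
    """
    by_species = defaultdict(list)
    for species, subject_id, score, evalue in hits:
        # Drop hits weaker than the expectation-value cutoff.
        if evalue <= evalue_cutoff:
            by_species[species].append((subject_id, score, evalue))

    selected = {}
    for species, rows in by_species.items():
        rows.sort(key=lambda row: row[1], reverse=True)
        best_score = rows[0][1]
        threshold = best_score * (1.0 - window)
        # The best hit always passes, so each species with any
        # qualifying hit gets at least one entry.
        selected[species] = [row for row in rows if row[1] >= threshold]
    return selected


# Hypothetical hits for a single query; the IDs are made up.
example_hits = [
    ("fly", "FBgn_A", 500.0, 1e-120),
    ("fly", "FBgn_B", 485.0, 1e-115),
    ("fly", "FBgn_C", 400.0, 1e-90),
    ("worm", "WORM_A", 300.0, 1e-60),
    ("yeast", "YEAST_A", 80.0, 1e-10),
]
report = select_hits(example_hits)
```

In this example the second fly hit is kept because its score is within 4% of the best fly score, the third is dropped, and the yeast hit is excluded by the expectation-value cutoff, so no yeast entry is reported.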