Multiple Sequence Alignment Program for Proteins


-------------------------------------------------------------------
Three alignment algorithms are available; choose the one that suits your data.
First, run dist_amino on your data:

 dist_amino [-f matrix] < data > dist

Then run one of the following:

(1) align_amino 1 [-a,-s,-n] [-f matrix] < dist > result
(2) align_amino 2 [-a,-s,-n] [-f matrix] < dist > result
(3) align_amino 3 [-a,-s,-n] [-f matrix] < dist > result

(1) Tree-based Method
(2) Tree-based Round-robin Method
(3) Tree-based Round-robin Iterative Method

(1) gives a rough alignment but is very fast.
(2) aligns well but takes a little more time.
(3) aligns very well but takes a long time.

(3) is the best algorithm.
If your data set is small or you have enough time, please use it.

If your data set is too large or you do not have enough time,
please use (2).


-----------------------------------------------------------------------------
You can choose what is output with the options [-a,-s,-n].

-a : All results  (tree-based, round-robin and iterative results)
-s : Some results (tree-based and round-robin results)
-n : Only the final result


-----------------------------------------------------------------------------
You can also change the scoring matrix with the option [-f matrix].
For the format of a matrix file, please see the example matrix file
'matrix_example'.
The default matrix is PAM250, defined in 'matrix_amino.c'.
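
The two steps and their options can be put together as in the sketch
below. It only assembles the command lines shown in this file and does not
run anything. The function name build_pipeline and the file names are
hypothetical, and passing the same matrix file to both programs is an
assumption.

```python
import shlex


def build_pipeline(data, result, method=3, output=None, matrix=None, dist="dist"):
    """Return the two shell command lines for one alignment run.

    Hypothetical helper: it only mirrors the syntax documented above.
    """
    if method not in (1, 2, 3):
        raise ValueError("method must be 1, 2 or 3")
    if output not in (None, "-a", "-s", "-n"):
        raise ValueError("output must be -a, -s, -n or None")

    # Assumption: the same matrix file is given to both programs.
    matrix_opt = ["-f", matrix] if matrix else []
    output_opt = [output] if output else []

    step1 = ["dist_amino", *matrix_opt]
    step2 = ["align_amino", str(method), *output_opt, *matrix_opt]

    return (
        f"{shlex.join(step1)} < {shlex.quote(data)} > {shlex.quote(dist)}",
        f"{shlex.join(step2)} < {shlex.quote(dist)} > {shlex.quote(result)}",
    )


if __name__ == "__main__":
    for command in build_pipeline("data", "result", method=2, output="-s",
                                  matrix="matrix_example"):
        print(command)
```
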


------ Input data format -----------------------------------------
CutM=80
CutI=90
U,V,S=7,1,1

Seq=
CSRC (HUMAN):KLGQGCFGEVWMGTWNGTTRVAIKTLKPGTMSPEAFLQEAQVMKKLRHEKLV
CABL (HUMAN):KLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNL
EPH (HUMAN):VIGEGEFGEVYRGTLRLPSQDCKTVAIKTLKDTSPGGQWWNFLREATIMGQFS
FER (HUMAN):LLGKGNFGEVYKGTLKDKTSVAVKTCKEDLPQELKIKFLQEAKILKQYDHPNI
IR (HUMAN):ELGQGSFGMVYEGNARDIIKGEAETRVAVKTVNESASLRERIEFLNEASVMKGF
CROS (HUMAN):LLGSGAFGEVYEGTAVDILGVGSGEIKVAVKTLKKGSTDQEKIEFLKEAHLM
TRK (HUMAN):ELGEGAFGKVFLAECHNLLPEQDKMLVAVKALKEASESARQDFQREAELLTML
BFGFR (HUMAN):PLGEGCFGQVVLAEAIGLDKDKPNRVTKVAVKMLKSDATEKDLSDLISEME
RET (HUMAN):TLGEGEFGKVVKATAFHLKGRAGYTTVAVKMLKENASPSELRDLLSEFNVLKQ
EGFR (HUMAN):VLGSGAFGTVYKGLWIPEGEKVKIPVAIKELREATSPKANKEILDEAYVMAS

-------------------------------------------------------------------
Write each sequence as "Sequence name:Sequence data"
or ":Sequence data". Even if there is no sequence name, the ':' is required.
Each sequence must end with a newline (return key).
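
To check a data file before running the programs, the format above can be
read with a short script. This is only a sketch: parse_input is a
hypothetical name, and it assumes that the header lines are "key=value"
pairs, that "Seq=" starts the sequence list, and that the first ':' on a
line separates the name from the sequence.

```python
def parse_input(text):
    """Split an input file into header parameters and (name, sequence) pairs."""
    params = {}
    records = []
    in_sequences = False

    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines

        if not in_sequences:
            if line == "Seq=":
                in_sequences = True
                continue
            key, sep, value = line.partition("=")
            if not sep:
                raise ValueError(f"expected 'key=value', got: {line!r}")
            params[key] = value
        else:
            # The ':' is required even when the sequence has no name.
            name, sep, sequence = line.partition(":")
            if not sep:
                raise ValueError(f"missing ':' in sequence line: {line!r}")
            records.append((name.strip(), sequence.strip()))

    return params, records


if __name__ == "__main__":
    example = "CutM=80\nCutI=90\nU,V,S=7,1,1\n\nSeq=\nCSRC (HUMAN):KLGQGCFGEV\n:KLGGGQYGEV\n"
    header, sequences = parse_input(example)
    print(header)
    for name, sequence in sequences:
        print(repr(name), len(sequence))
```
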


------ Parameters -------------------------------------------------
We recommend two parameter sets.

(1) CutM=80       (2) CutM=85
    CutI=90           CutI=95
    U,V,S=7,1,1       U,V,S=7,1,1

U,V,S -> First gap cost      -> U+V
         Subsequent gap cost -> V
         Out gap cost        -> S
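
As a worked reading of the gap costs above: if the first position of a run
of gaps costs U+V and every further position costs V, a run of k gaps costs
(U+V) + (k-1)*V = U + k*V. The sketch below computes this for the default
U=7, V=1. The function name internal_gap_cost is hypothetical, and this
reading of "first" and "not first" is an assumption. How S is applied is
not described in this file, so it is not modelled here.

```python
def internal_gap_cost(length, u=7, v=1):
    """Cost of one run of `length` gap positions under the assumed reading.

    Assumption: the first position costs u + v, each later position costs v.
    """
    if length < 0:
        raise ValueError("length must not be negative")
    if length == 0:
        return 0
    return (u + v) + (length - 1) * v


if __name__ == "__main__":
    for k in (1, 2, 3, 10):
        print(k, internal_gap_cost(k))
```

With U=7 and V=1, a single gap costs 8 and a run of three gaps costs 10, so
one long gap is cheaper than several short ones of the same total length.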

(1) is for general use.
(2) is faster.
