|
AD01 In recent efforts to understand genes, it is critical to find proteins of similar sequence, structure, and function both within and between species. This process is extremely CPU and disk intensive and requires a complex approach to produce usable results. We will discuss ways to use SAS to: (1) manage very large virtual drives in Linux file server clusters, and (2) manage Windows NT clusters of workstations (COW) to simulate parallel processing. Using layers of file monitoring, job distribution, data analysis, and data visualization, SAS can provide clever and cost-effective paths to COW management and usage [X command, %MACRO, DATA, STAT]. Parsing the problem, distributing it symmetrically across the COW, and executing the randomized solutions are discussed in the context of an example: matching proteins within a worm and between a worm and a fly. This approach to proteomics puts supercomputing capabilities in the hands of researchers at very low cost and with very low maintenance. |