Re: virus: Level Three-Belief and Utility.

zaimoni@ksu.edu
Sat, 2 Nov 1996 11:23:01 -0600 (CST)


On Fri, 1 Nov 1996, Jason McVean wrote:

> KB wrote:
> > Having read up on standard approaches: even isocoding by hand gives
> > superior results. [This is a highly nonstandard approach.] It would not
> > surprise me that genetic algorithms can do better than incompetent
> > standard approaches.
>
> Can you give me a basic starting point reference for this
> ("isocoding by hand")? I'm not familiar with the term but it
> sounds as though it is relevant to my work.

I will have to do a reference search ASAP.

Actually, I was improvising. None of the math professors I've mentioned
this to have called me on it.

The basic ideas are these:
Programs ARE mathematical objects. The version we are really
interested in is the one the computer actually executes, but it is
necessary to use source code to make development reasonably possible. [A
good database or spreadsheet program IS a programming environment; it
just doesn't look like one.] Several theoretical justifications for this
are possible; none I have seen are really useful in practice, except for
technicalities. The most important such technicality is that
simultaneous substitution is invariably hard to define correctly.
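To make that technicality concrete, here is a tiny C++ sketch (the
names are invented for this example): textually substituting an
expression for a variable duplicates the expression's side effects, so
the "substituted" program is not equivalent to the original.

    #include <cstdio>

    static int counter = 0;
    int next_value() { return ++counter; }  // side effect: advances counter

    #define TWICE(x) ((x) + (x))  // textual substitution: x appears twice

    int main()
    {
        int v = next_value();              // evaluate once: v == 1
        int a = v + v;                     // a == 2
        int b = TWICE(next_value());       // evaluated twice: 2 + 3 == 5
        std::printf("a=%d b=%d\n", a, b);  // prints "a=2 b=5"
        return 0;
    }

A correct definition of simultaneous substitution has to account for
evaluation order, side effects, and name capture, which is why it is
hard to get right.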
I define two "fragments of source code" as isomorphic if and only if
they are indistinguishable: for all possible legal inputs, their output
is IDENTICAL. Robust programs consider resource failures to be inputs
that must be dealt with. Execution time, RAM used for code, and RAM used
on the stack or dynamically allocated are not outputs; they are
resources used to generate outputs.
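For a hypothetical C++ example of two isomorphic fragments (mine, for
illustration only):

    // For every unsigned int x the two functions return the same value,
    // since unsigned arithmetic is defined modulo 2^N; either body may
    // replace the other without altering program behavior.
    unsigned int double_mul(unsigned int x) { return x * 2u; }
    unsigned int double_shl(unsigned int x) { return x << 1; }

The signed-int versions would be much harder to certify as isomorphic,
because signed overflow behavior is not defined.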
I define "isocoding" to be a process that is supposed to optimize
the resources used [execution time, RAM used for code, RAM used on stack
or dynamically allocated] without altering ANY program behavior. [If you
want to alter program behavior, that is either adding new functionality
or debugging.] The exact details of this process depend heavily on the
language used.
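Here is a small hypothetical instance of such a transformation in C++
(process() is a stand-in for real per-character work; the rewrite
preserves behavior only under the stated assumption that nothing
modifies the string during the loop):

    #include <cstring>
    #include <cstddef>

    void process(char /*c*/) { /* stand-in for real per-character work */ }

    // Parent: std::strlen(s) is recomputed on every iteration.
    void scan_slow(const char* s)
    {
        for (std::size_t i = 0; i < std::strlen(s); ++i) process(s[i]);
    }

    // Isocoded: the length computation is hoisted out of the loop.
    // Output is IDENTICAL for all legal inputs (provided nothing the
    // loop calls can change the string), but execution time drops from
    // quadratic to linear in the length of s.
    void scan_fast(const char* s)
    {
        const std::size_t len = std::strlen(s);
        for (std::size_t i = 0; i < len; ++i) process(s[i]);
    }

The caveat in the comment is exactly the sort of language-dependent
detail mentioned above.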
My implementation of the above is that I want to have a "library" of
generic isomorphisms of source code. These isomorphisms form the rules
by which I mutate the program. I apply natural selection via the metrics
named above, combined with the role of the code [error-handling code
should be fast to bypass, not fast to execute], to evolve the program to
superior performance. This is somewhat related to genetic programming,
actually. It is essential to comment out the parent variant while
testing the mutated version, because the mutant may well be inferior.
While I have adapted a number of standard techniques for optimization
into this framework, the overall metaphor does not lend itself naturally
to them.
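A minimal sketch of that discipline in C++ (the function and data names
are invented for the example): the parent variant stays in the source,
disabled by the preprocessor, so it can be restored immediately if the
mutant measures worse.

    #include <cstddef>

    unsigned int sum_squares(const unsigned int* v, std::size_t n)
    {
        unsigned int total = 0;
    #if 0
        // parent variant: indexed loop
        for (std::size_t i = 0; i < n; ++i)
            total += v[i] * v[i];
    #else
        // mutant: pointer iteration; isomorphic to the parent, and kept
        // only if it wins on execution time or code size
        for (const unsigned int* p = v; p != v + n; ++p)
            total += *p * *p;
    #endif
        return total;
    }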

[I work in C++, and have set the compiler to generate mixed source
code/assembly/machine listings; between that and the assembly manual, I
can often determine the effects of an isomorphism/mutation without
actually executing the code. This is also useful for detecting compiler
errors (yes, I've seen THOSE; I must remember to put a chapter on the
Web containing an ad for Crash Resistor C++).]

The key limitation of the above is that, by definition, it is not
supposed to alter program behavior. Assuming that I don't want to do
that, it has a slight advantage over a genetic algorithm in that all of
the mutations are guaranteed to preserve behavior; the selection process
is focused on the resources issue.

//////////////////////////////////////////////////////////////////////////
/ Towards the conversion of data into information....
/
/ Kenneth Boyd
//////////////////////////////////////////////////////////////////////////