Towards Super Logic
This is an initial finite propositional logic interpretation of traditional human brain based logic, math and philosophy in a form suitable for implementation in a well defined intelligence such as a computer. See Blind faith math and negative feedback neural net . There are currently 2 sections: - Simple fundamentals - mostly finalized..
- Traditional calculus - converted half way.
1. Fundamentals (Feb 97) - Names: Each unique name is a unique nerve whose stimulation may stimulate or inhibit another nerve thru a unique synapse. Names are used to activate nor-formulas or inhibit all formula parts except those where a new 'nor' is to be added using the 'experience' of a new lesson.
- Nor-formula: Integrated circuit( Transistor-Transistor Logic nor-gates) connected 'nor' elements - inputs wired to only one output - experience an integrated circuit with an oscilloscope - see -ANN(negative feedback artificial neural net ). The -ANN, being written in 12 pages of JavaScript, can easily and completely be experienced by a human. It can also be experienced by another -ANN by email. This experience is independent of human control, biases, intuition, contradictions, infinity or supernatural timeless concepts. This simplicity, independence and finitely observable responses is the basis of super-logic. In order that any NF in -ANN may be extended in any way by a new synapse between any 2 nerves, each synapse must be on the end of a separate nerve that can be inhibited by another uniquely stimulated nerve called its inhibit name. This 'intermediate' nerve is connected to any source nerve by a 'copy' type synapse and for convenience may have a 'copy' connection to its individual name nerve. The standard learning routine creating any new synapse is: stimulate one nerve, designate this as source nerve by waiting, stimulate 2nd, wait to designate 2nd as a destination, {relax 2nd to designate copy synapse}, wait for 1st to find 2nd(seek and join using special nerves), relax 1st( negative feedback reward), wait to reset ages solidifying connection. Thus all inconvenient connections in a NF may be inhibited so they don't confuse learning a new connection.
- Equal inputs: The 'Super-set'. All those inputs to a NF that result in the same output. The inputs are said to be equal for the NF. In traditional notation, {x|NFa(x)=y} where NFa(x) is a nor-formula 'a' with 'x' a particular pattern of stimulations and non-stimulations(relaxations) of a's inputs and 'y' is the resulting output.
- Finite propositional logic induction: This is the super-logic filtering of traditional, human brain based 'mathematical induction'. Consider a subset S={s1,s2,...sj} of the inputs I to a given FPL formula F(a nor-formula) and O={o1,o2,...oi} to be all the outputs. Here sj and oi have values of 't' or 'f' only. And I-S is considered fixed.
- F outputs Oa for SOME particular S pattern of input 't' and 'f' say Sa.
- If when F outputs Oa for Sb it also outputs the same Oa for Sb' where Sb' is Sb with ANY ONE variable changed( sk'= not sk for any one k).
- Then F outputs Oa for ALL permutations of 't' and 'f' in Sa.
- Finite Continuation: This is the super-logic filtering of traditional 'limits'.
- Non-Cantor set comparison: Lately I have avoided the need to use Cantor's IMAGINED INFINITY to compare the size of sets: he decides equality[equipotent] if there exists a 'rule' to generate a 1 to1 mapping between any elements of the sets. If the sets have an infinite number of members, this 'rule' implies an infinite number of steps or operations to generate ALL the pairs (n,2n); thus it is not a proper rule for a finite intelligence. Only a supernatural intelligence like a Turing machine with an infinite memory and infinite speed 'could' generate all the integers and then reorder them into the desired pairs. My idea is to compare sequences of FINITE sets using rules that require only a finite number of operations(could be lessons for -ann). For example in comparing the 'set of all integers' to the 'set of all evens' consider a sequence of finite sets of pairs of equal integers: {1,1,2,2}, {1,1,2,2,3,3,4,4}, {1,1,2,2,3,3,4,4,5,5,6,6}... then rearrange integers only within each finite set:{1,2, 1, 2}, {1,2, 2,4, 1,3, 3,4}, {1,2, 2,4, 3,6, 1,3,5, 4,5,6}...This gives paired integers with evens and odds & left over integers. The verification of this is to implement it in an exactly defined & completely observable machine - super-logic. It is clear that at each step the number of odds & left over integers equals the number of paired integers and evens. So the limit(finite continuation) of this comparison of even integers to all integers is 1 to 2; not 1 to 1 by Cantor's non-'finite contenuation' method.
One might object: 'why not use tripples: {1,1,1, 2,2,2, 3,3,3 }and add the last 2 of each tripple giving ( 1,2, 2,4, 3,6 }. Shows evens are same size as all?' But one can say of this: 'twice as many numbers were used to get these evens. Hense, again, there are half as many evens as all'!
Phew! That was close. I need to prove in nor-formulas, the consistency of 'finite continuation'. Can I use 'material math induction'? Can that be put into a nor-formula? To start with 'material math induction' requires that all sequences are limited to what a specified memory can hold. And the sequences are the order the values are stored in memory at a given time. Then the above is just the sequential outputs of a computer program. And MMI is just a way of predicting those outputs by analysing the program. It is clear that we can transform sequences of integers increasing by 1 into varous numbers of evens. Then it is vacuous to compute the relative numbers of various sets of results. I have to say that the question of the relative number of evens to all integers is not well defined for a well defined intelligence! - Contradictory infinity:
- Integers: Consider "we can always get another integer by adding 1 to any given integer". This would seem to be true for both of us. So I take my integer as yours +1 but if you take your integer as mine +1 or I give it to you, does either of us get an integer or something always changing? I think most of this is cleared up by specifying exactly to what it is given, how given and how accepted and when = lesson for -ann. Problems with given: Is the integer explicitly printed out like '256' or given implicitly as 2^8? In either case does the intended recipiant have the ability to remember it or figure it out? If explicitly given, the number must not be so long that one can not remember it or can not wait until a computer 'finishes' printing out the last of a billion billion digits. If implicitly given, the instructions must be 'understandable' by the intended recipiant in a finite time.
- ... : Consider B the set of all binary numbers. Usually this is considered to be a valid idea, for humans. But does B contain D=111... an infinite string of ones? There are several troubling ideas here:
- How can we be sure 'all binary numbers' is contradiction free or useful?
- Is the infinite thing D well defined? Cantor 'defines' this by changing the diagonal digits of an ordered table of all the binary numbers.
- Can a infinite memory, infinite speed 'machine' like the Turing machine generate 'all binary numbers'? Can it finish generating D? It seems we could 'define' infinite speed and infinite memory to be whatever is required for the machine to finish following an infinite set of finite instruction steps. Certainly it could finish D by just storing another '1' after the last one. But can it finish an infinite number of infinite strings of digits? It might seem that finding a place to start storing a second infinite string after memory is filled with the first would be impossible. But consider the scheme: interleave the digits of different strings so there is an infinite number of starting points. That is, after taking the next digit of the longest waiting string take the first digit of a new string. Then we only need a scheme to systematically start generating all binary strings. I say 'strings' to avoid the conundrum of 'all binary numbers' possibly not containing D since some people might not like to consider D a number. Thus we can consider the normal binary numbers to be made infinite strings by following them with an infinite string of '0's. We just need to start making strings with the binary numbers as we start storing their digits in the above interleave scheme. One may still ask "where do we start storing D". I am tempted to answer"My finite intelligence does not allow ME to answer but the Turing maching with its infinite speed and memory will find the place".
- Quantifiers: Let UxPx symbolize P is true for unbounded, ordered x: That is, if P is true for y, there always exists a x>y so P is true for x. Where (x)P symbolizes P is true for all x, (x)Px implies UxPx. For contradiction, let Gx be set of all positive integers >x. Let Px be the statement:Gx is nonempty; by mathematical induction (x)Px so UxPx. But take any integer n, there exists an x>n so Gx for unbounded x does not contain any x !
- I have come to the idea that significant infinity arguments can be equivalently restated as finite arguments with a special ordering of choice[see below] of uniquely, finitely defined elements.
- Infinity arguments: G.Cantor's, A.Turing's, Godel's, Dedekind's, R.Penrose's, G.Caitin's, etc., don't hold for well(finitely) defined, simple, materialistic intelligences see negative feedback neural net . They give human proof of the unreliability of traditional human based logic, math, philosophy.
- 'Higher' order logic theorem 'proving' computer programs only seem to prove because they are not exactly defined in computer language a low, finite propositional logic: Any HOL proved by computer is at most a LOL theorem. It is equivalent to a finite digital transform - see 'math in nor-formulas'. Any method of defining terms that can be just as easily defined oppositely is not based in the medium by definition. Only those terms translated into digital logic give results proved by it - a computer may print out any variables as 'GOD DOES EXIST' or '1=1' and it does not have to be rewired or the word processor reprogrammed to print out 'GOD DOES NOT EXIST' or ' 1=2' as this very display demonstrates. Computer variables may be exactly defined as the flip or flop of its flipflops - see -ANN flipflop lesson.
- However, finite state automatons cannot exactly model the universe because they are a recursive part of it and will ultimately run out of memory. But, one may still find an optimum for a specific circumstance.
- Finite Continuation:Given a set of sequences of finite sets where some properties may change but others remain fixed, then the fixed properties that remain are called the finite continuation. In detail: given sequences{s1[n], s2[n]...sj[n]} and given properties Pcv, Pfv in { Pc1(s1[n]), ...Pc1(s1[n])}, {Pf1(s2[n]),...Pf2(s2[n]}, properties that are true independent of n are true continually for n: if Pfv(sv[m]) is true for any m then Pfv(sv[m]) is true for unbounded m, is true 'in the limit' is the finite continuation.
- Globally finite: [24Dec96 corrected Oct 18 2003]All the definitions and their uses fit in a given automation's lifetime. All uses of definitions require less steps than the maximum cycle period of the automaton.
- Order of Choice: [26 Nov 96] nx is to indicate in a proof the nth order of choosing a value for x: for n<m, values are chosen for mx after nx. For example: let Gx be a set of all the positive integers < or = x. Examples:
- Difference of Gn without Gm is Gn-Gm for m<n. But in old terminology, Gn-Gm for all m,n which I call G, is empty; because 'all m,n' implies we have already chosen n or m larger than any subsequent choice to test content. Can we assume anyone is capable of choosing a number larger than anyone can subsequently choose? Normally, to show G is empty one chooses any positive integer i after one has constructed the set [this construction is only in the physically undefinable imagination], then since we have already chosen some j > i there is already Gj-Gi that does not contain this any i so G is empty! But no Gn-Gm n>m of G is empty! This use of 'all m,n' is an infinity argument.
- Cantor's diagonal: Consider making table B of all binary integers and making another binary number, D. If we choose to put any binary number in the table before choosing to make another number using digits from B, we can make a number not in the table, otherwise we don't. That is, construct D by chosing the ith digit from the jth number in B and changing it. Then choose any kth number of B; can D be different from Bk? The answer is yes. And we did not need to know what infinity was. If we physically make another table B after D is made, D could then be physically added to the 2nd B.
- Computer cycling:(1page display, 3k javaScript). Deterministic computers with finite memory and output reentered as input will ultimately start repeating outputs. Since there are only a finite number of possible internal states(binary numbers), input is left constant and each succeeding state follows uniquely from a predisessor, some state must be ultimately repeated and all between. The maximum period is less than 2^(total memory bits including its clock)=all memory set to 1's. Every real computer cycles by 'finite continuation'. Further we could postulate that a computer is speeded up in proportion to its memory size so that the time taken for its cycle is constant by finite continuation, but both are still finite.
- Axiom of choice: Traditionally, for any given collection of sets, a set can be made from a unique member of each set of the collection. In super-logic, everything must be uniquely and finitely definable. The set of all these definitions can contain only finite definitions. This cannot be infinite since that implies unbounded definitions.
- Rearranging Non-convergent series: The idea is to start with an 'equivalent' sequence of finite sets. For example, instead of the poorly defined {1,-1,1,-1...} use {1,-1}, {1,-1,1,-1},... so we can have a sequence of rearranged finite sets: {1,-1}, {1,1, -1,-1}, ... It is now clear that the limit of any rearrangement and sum applied to each member of the sequence is zero so the 'finite continuation' for this sequence is zero. [Originally conceived in Colorado College Math class 30 years ago].
- Super-rationals: This is to make rigorous the idea (after Konig 1905) of the class of numbers that can be defined uniquely by a finitely reentrant definition. Humans may believe that these numbers are countable by an alphabetical ordering of their definitions. The sequences include all 'diagonal' generations that are finitely reentrant. They include all rationals constructable from the integers. They do not include exact irrationals since they imply ratio of infinite integers.
- Finite definitions: To make -ann lessons(see flipflop) for defining numbers or anything else, I propose each number can be activated(non-inhibited) by a unique nerve taken for its name and another unique nerve taken for the unique output desired when the stimulations satisfying the definition are met. Finite definitions can be replaced by a finite list of the defined.
- Finite recursion: nor-formulas are exactly augmented by more connections when only the needed nerves are activated: planned inputs are from named nerves connected to the nor-formula inputs by other nerves that are actively inhibited during the lesson connecting them. Interactive -ANN:(2page display, 8k-js, 1 to 8 hour study) Any finitely recursive(nerve states become constant after a finite number of steps) nor-formula may be replaced by a nor-formula that requires only one step: just reproduce the final output pattern resulting from the initial input.
- Super-language: I take the approach that language could be a kit of named building blocks for any model construction. -nn is motivated to use a model construction kit that most often results in nor-formulas that reduce the constancy of the inputs(increase entropy faster). It is assumed that the kit is efficient enough to provide extensive successful experience to automate its use. Here is where super-logic and super-philosophy are developed: collect a kit of named nor-formulas that compactly span all possible mappings of sequences of inputs to sequences of outputs. There are nor-formulas that help putting the blocks together - grammers. They should allow modeling a self-contained, uncaused, unbiased universe and the -nn's interaction with it to increase entropy sooner - the anti-model. The grammar(building rules) of a super-language for a given artificial net includes 'non-self-reference' and the nerve stimulation relaxation sequences required to make synapses.
- Rigorous calculus computations without 'completed infinity':
- Example of parabola(x^2): At an arbitrary x, the lines that intersect the parabola in 2 points have slopes of 2x + h; h is the non-zero distance(+ or -) between the 2 points. These lines that lie on both sides of the parabola for some neighborhood of x are called transversals. Thus transversal slopes include all values except for 2x (since h is any value but 0). Thus the non-transversal of the parabola at x has slope 2x. And we did not have to think of 2 points becoming 1 point as h 'approaches' 0.
- Limit: Traditionally: if in real neighborhoods of X and F, there is a formula d(e)>0 for any e>0 so d(e)>|X-x| implies e>|F-f(x)| then F is named the limit of f(x) as x aproaches X. This does not require a 'completed infinity' of x 'becoming' X or f(x) 'becoming' F - a great improvement over earlier definitions. But 'real' causes problems for -nn lessons: I think we need to specify(finite define) a finite metric space so 'real neighborhood' means 'all points in the space except the one they are close to' and just finitely refine this space as needed.
2. Transversals
Dec96 This proposes simpler calculus ideas that permit theorems similar to theorems about differentiable functions to be made for only connected and bounded functions;
and perhaps even generalizable to families of sets without a metric. The basic math is designed for ultimate rigorous redefinition as lessons for a neural network.
All 'curves' or functions below are considered to be connected and bounded. All 'lines' are considered to be straight. Chords on a curve are lines through 2 points of the curve. Definitions
A line that intersects a curve at a point is said to be a transversal if for SOME neighborhood of the point, one half of the line lies on one side of the curve and the other half lies on the other side and they have only this one point in common. For graphs of 2 dimensional functions, a transversal is either above the function on the left of a point on the graph and below on the right OR below on the left and above on the right. the former is a negative slope transversal and the other is positive. A transversal is embedded if in some neighborhood of the point, other points of the transversal are in the exterior of the graph.
Indirectly: A non-transversal line is any line through a point on a curve which is NOT a transversal at that point.
Examples: tangents to a circle are non-transversal lines; they are also 'support' lines. ALL lines through an inflection point are transversals(through origin of x^3). The x-axis is a non-unique, non-transversal line at the origin of x(sin(1/x))but is unique for x^2(sin(1/x)). In the former, transversals have all slopes between -45degrees and plus 45. In the latter transversals have all slopes but 0.
Directly: In ANY neighborhood of a non-transversal POINT separating a curve into 2 connected parts, one side of the non-transversal LINE has points of both curve parts. Call the latter 'points' non-transversal neighbor points. In the generalization to surfaces below it will be convenient to take these points to be on the boundary of the neighborhood. The idea of on a SIDE of a curve and SEPARATION of one curve into 2 parts by another, seem closely related to CONNECTIVITY: a set S may be said to be connected in set X relative to a set of curves C provided any 2 points of S in X also are contained in some c of C and c is wholly in X.
Theorems
- For every chord of a curve there is a parallel non-transversal at a point between its ends. ( like mean value theorem for tangents).
Proof for curves as functions: subtract the chord from the curve. Boundedness of the curve-chord implies there is a bounding(support) line which is parallel to the chord.
- The converse of theorem 1 is also true. (but converse is not true for tangents since there are tangents at inflection points and have no parallel chords). Converse: for every non-transversal line and every neighborhood of the non-transversal point there is a parallel chord subtending it (the non-transversal point lies between the chord's curve points). This depends on the connectivity of the curve; continuity is not needed. Proof: pick the point of 2 non-transversal neighbor points which is closest to the non-transversal line and pass a parallel through it. By connectivity, this parallel must intersect the curve between the non-transversal point and the other non-transversal neighbor point.
- A curve is differentiable at a point if and only if there is at most one non-transversal there. Proof: derivate numbers at a point imply and result from non-transversal neighbor points for a line of intermediate slope through the point. The derivative is the only slope transversals don't have.
- A function is continuous at a point if it has both + & - slope transversals there.
- A function is continuous at a point if and only if a vertical is embedded on both sides.
- A function is monotonic iff it has horizontal transversals at all points.
- A function is differentiable 'almost everywhere' if its transversals have bounded slope.
- An inflection point defined as a point with only transversals; it is imbedded in a connected set of non-transversal points unless the curve is straight.
- A function is increasing if and only if its non-transversals have positive slope.
- A convex function has no inflection points and has at least one non-transversal at every point.
- Corollary to differentiability 'almost everywhere' of bounded monotonic functions(Lebegue): bounded monotonic functions have at most one non-transversal almost everywhere.
- If a surface is jointly continuous, convex in one direction and differentiable in that direction then these derivatives are also jointly continuous. Proof: use theorem 1 and 2 and show that the slopes of chords parallel to the tangents approach each other because of the joint continuity of the surface and that the chord lengths may be kept from zero.
- Generalizations of the 'lines' in 'non-transversal lines'.
These lines have been considered 1st order polynomials: a+bx for a and b that keep the line on the same point(p) of the curve. So lets consider those polynomials through (p) that do not simply cross over the curve(non-transversal neighbor points are below/above the curve for some neighborhood). The limit of these coefficients equals the corresponding coefficients of the Taylor's series expansion of the given curve.
For 'differentiating' a surface, one could compare it to 2 dimensional polynomials by considering non-transversal neighbor points below/above it. The 'derivative' would be the polynomial coefficients. For 'parallel' 'chord' polynomials, let's consider non-intersecting polynomials of the same order as the 'non-transversal' polynomial.
Netized Non-Transversals (pre 1997)
So far I have found it possible in math applicable to finite, explicit neural nets to replace ideas that seem to require infinite steps or elements see 'finite continuation' above. Also, I will try to keep to simple sets of points without distance as this seems close to simple -nn lessons. Non-transversals don't have limits in their definition. But parallelism used in some statements seems to need distance unless one can be satisfied with the boundaries of nested sets.
(It may be more productive for -nn to design tautologies and specialized languages instead of trying to adapt traditional human oriented language. But here goes:). - Net Set : A 'nor formula' is activated by stimulating its 'name' nerve(net set activation name). Each set of stimulations of input nerves of an activated 'nor formula' may stimulate(through synapses) a unique nerve. Normally this unique nerve inturn stimulates an output, an answer name: (net set membership). It is common human habit to use one name to recall a set definition and to respond with the same name when one detects a member(point) of that set. For example: one may crudely define a car as a motor vehicle with 4 wheels, 2 seats, 2 doors. Then when one sees a car one may respond by saying "car" or "red car",etc. (The word "red" may be the result of a different simultaneously activated 'nor formula'.) This may clarify to humans the super logic distinguishing of 'activation name' from 'result name'. Thus a net set is defined to be all those neural net input stimulation patterns that stimulate the same result name. See -nn definition + diagrams(8 page + 3*6k gifs).
- Set boundaries(curves): For starters I will try something close to standard ideas: I'll try intersections of sets(like a curve is the intersection of 2 surfaces); but that means a boundary of one set requires another set: it is relative to another set. So we can first define a space as a family of net sets for definition of surfaces, curves and boundaries relative to it.
- Family of well bounded nested sets: The 'well bounded' condition is to mean any 2 member sets of the nested set do not have common boundary points and the sets are not members of any included net set.
- A family of well bounded nested sets is said to be a fully nested set in the space if for any point in the space there exists one and only one set with that point on its boundary.
- A 'non-transversally' connected curve is a curve for which theorems 1 & 2 above are true. Or using a more general form of theorem 2, a set of points S is -nn(negative feedback Neural Net) connected relative to a fully nested set F if S is not a boundary of any set of F and if for any two points of S in the boundary of any f1 of F there then exists a third point p in the boundary of f2 of F where f2 contains f1 and no other point of S. It may be necessary to add the converse if it can't be proven: given any unique p in S and in f2 there exists an f1 in f2 containing other points of S. This is an example of 'generalization by reduction' (please see definition of -nn with link below)
for full definition + diagrams of -nn (8 page, 3*6k gifs)
Computer cycling:(1page display/3page netscape2 JAVASCRIPT) All computer programs have a finite cycling period; programs cannot compute their own period or need to: questions Turing's, Godel's and Penrose's theories for real computers.
Russell's Paradox:(1page display/2page netscape2 JAVASCRIPT) Computer just cycles name of catalogue of books that do not cite themselves.
E-mail author: R Massey for additions Oct 2003.
top/home (40k)
math 28apr96.