d4.doc

This is generated from a plaintext file,  d4.doc.

D4 DEFINITION 08/2003

http://www.d4maths.lowtech.org/mirage/install.htm
http://www.d4maths.lowtech.org/dna.exe
http://www.d4maths.lowtech.org/math2545.tgz

(<-c) Tony Goddard, SHEFFIELD 2003
Info: +44 (0)7944 764312
E-mail: tony@lowtech.org

D4 is an array processing language based on APL, but with features from PERL and the C-shell found on UNIX systems. The interpreter runs on DOS/WINDOWS 95 and WINDOWS 98, and also on LINUX/X. A preliminary version ran on SUN/SPARC workstations. DNA.EXE has been downloaded on a variety of WINDOWS/NT style platforms in colleges and offices, and the system works.

Some scripts which use #cmd "shell_command" constructions need adjusting for Windows/NT. For example, to create another terminal box, use the respective alias:

    WINDOWS_NT   ntclone|#cmd "cmd /c start d4x v=14 test.afn ", ARGS
    WINDOWS      xclone |#cmd"start \windows\desktop\d4x.pif"
    LINUX        xclone |#cmd #deb "rxvt +sb -e sd4 v=3 tty=rxvt ", ARGS, " &"

RUNNING THE INTERPRETER.

On DOS/WINDOWS D4 comes in two versions: a 16-bit version for creating rescue disks and suchlike, and a 32-bit version for handling large arrays in virtual memory. The 16-bit version boots from floppy disks, and I have used it in Digital Art installations. The 16-bit version is normally D4T.EXE, and is compiled under Borland Turbo-C. D4X.EXE is a 32-bit version available for DOS/WINDOWS. This was compiled with DJGPP, the Gnu C-compiler adapted for DOS. To run under DOS it is necessary to obtain the DPMI extender csdpmi3b.zip from an ftp site. A good place to try for European residents is:-

    URL=ftp://sunsite.doc.ic.ac.uk/packages/simtelnet/gnu/djgpp/v2

The simplest way is to run D4 from a command line. Use the DOS prompt in MS-Windows, or run in a shell window under UNIX/LINUX.

    d4x file-name {options}     .. or d4x for LINUX.

The file should be a .D4 program file (test.afn, dsh.afn). The options consist of a list of name=value pairs or ordinary strings. Some names are reserved and using silly values for them will cause trouble.

                                  default
    S,s   stack size              512
    N,n   names                   512
    W,w   windows                 50
    F,f   files                   32
    K,k   if defined, start up in input mode. Don't try to load int.afn.
    I,i   if defined, suppress break-in checking. For use in batch files.

The normal startup line is:-

    d4x filename {var=value} ...

If the parameter is of the form name=value then the variable called 'name' is given its designated value. Otherwise the arguments are stored in variables $0 $1 $2 etc. $0 is the name of the .exe file, and $1 is the name of the .afn file, which is a D4 script.

There is a default startup file, 'init.afn'. If this file is in the current working directory, it is used as the startup file. When the K= parameter is used, the program comes up with no prompt, and the user must type D4 commands to get the program to work.

On UNIX systems some name=value pairs need to be enclosed in single quotes to prevent the shell from destroying the intended effect.
For example, to pass the printer device to d4 use a line such as:-

    d4x '$LP_DEV=/dev/lp1' test.afn

Under WINDOWS-95 and WINDOWS-98 the interpreter may be run by clicking on certain files when viewing with a browser. To do this it is necessary to make an entry in the table which associates file types with programs. It is also possible to create a desktop short-cut. All of these methods mean storing a command line somewhere.

LINUX users will find a shell script called 'dsh' which starts off like a PERL script, with the name of the interpreter following a #! at the beginning of the file.

Available only with the LINUX version are 'socket' functions for sending datagrams between processes. At the moment these functions are only adequate to implement 'dumb clients' and a single 'dumb server'. These functions have been put in to facilitate video-wall installations.

Environment variables are passed to the program in a virtual array called $ENVP. Type $$ENVP in the editor to see the list of names.
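
The name=value convention described in this section can be sketched outside D4. The Python fragment below is only an illustration, not the interpreter's own code: the function name parse_d4_args is hypothetical, and it assumes that name=value words do not use up a positional slot (which is what the '$LP_DEV=/dev/lp1' example suggests, since test.afn is still the run file).

```python
def parse_d4_args(argv):
    """Split a command line into named variables and positional ones.

    argv[0] is the executable. Every later word of the form name=value
    sets a named variable; every other word becomes $1, $2, ...
    (assumption: named words are not counted as positional parameters).
    """
    named = {}
    positional = {"$0": argv[0]}
    next_index = 1
    for word in argv[1:]:
        name, sep, value = word.partition("=")
        if sep and name:
            named[name] = value
        else:
            positional[f"${next_index}"] = word
            next_index += 1
    return named, positional


named, positional = parse_d4_args(["d4x", "times.afn", "i=0", "2", "3"])
print(named)       # {'i': '0'}
print(positional)  # {'$0': 'd4x', '$1': 'times.afn', '$2': '2', '$3': '3'}
```

On a UNIX shell the single quotes around '$LP_DEV=/dev/lp1' are removed before the program sees the word, so the sketch receives it as an ordinary name=value pair.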

D4 SYNTAX.

D4 may be run in command mode. Commands may be either expressions, system commands or user defined functions. The simplest expressions involve operations on numeric or character data, in the form of vectors or matrices. The syntax of expressions is very simple:-

    E::= + | + E | E + E | L -
    +::= F | S | U          Function or operator
    -::= ++ | --            Postfix operator
    F::=                    Primitive function (+ - * % Etc.)
    S                       System function ( #window, #files, #sed, Etc.).
    U                       User defined function.
    D::= N | T              Data
    N     Numeric           Matrix or Vector
    T     Text              Any array of bytes
    E::= I                  Identifier
    E::= +E                 Monadic operator
    E::= E+E                Dyadic operator
    E::= (E)                Priority of evaluation
    J::= E | E: | :E        Index operator
    E::= E[J]               Indexing.
    E::= L<-E               Assignment
    L::= I | I[J]           L-Value
    I::=                    Identifier.

An identifier consists of a sequence of upper case letters, digits, the underscore (_), dollar sign ($), and dot (.). The starting character must be either a letter, dollar, or underscore. The use of identifiers allows the combination of various data types without the overhead of creating files.

Variables may also take names which are not recognised as identifiers. These variables cannot be used in expressions, but they may be used in cut and paste operations. These names can be passed from shell scripts.

A numeric token is a sequence starting with a digit or sign and including an optional decimal point and an exponent string consisting of the letter 'E' followed by a number which may have a sign (+ or -). Because the minus sign, '-', is used as an arithmetic operator, the negative sign in numeric tokens is the underscore, '_'. There is a reason for this: the parser recognises vectors as lists of numeric tokens separated by blanks. This means that the parser must be able to distinguish between minus ('-') as an operator and a minus sign ('_') as the sign of a numeric literal.

Text literals consist of text embedded between double quotes (").
If the double quote is itself to be included in a string then it must be doubled. The expression """in quotes""" becomes "in quotes". This can be most annoying for users who wish to write scripts with many doublequote characters. There is no mechanism for string literals to contain control characters or tabs, or to extend over more than a single line in the script. It is however possible to initialise string literals to contain multiple line constants in the form of 'here documents' similar to those found in PERL or the UNIX shell.

Statements consist of one or more expressions, separated by '&' or ';' symbols. A series of statements may be made into a function. Functions may have 0, 1 or 2 arguments, and local variables as required. This is specified in the first line of the function, which acts as a template. Functions are defined in ".afn" files. These are ASCII files, with function definitions separated by blank lines. Functions may also be created or modified with the editor #sed. A typical command might be #sed"X:d;$xx;:e;:f".

Examples:

    one parameter      two parameters      no parameters
    monadic            dyadic              niladic

    Z<-FACT N          Z<-A PLUS B         Z<-HELP
    |Factorial         |Add two numbers    Z<-#sed":r abc.hlp"
    ->(Z<-N=0)/0       Z<-A+B              Z<-#sed":e"
    Z<-N*FACT N-1

A function definition is a line consisting of two parts: the function definition template, and a list of local variables.

    <Fd>::=      <FT> <Lv_list>
    <FT>::=      I<-I | I<-I I | I<-I I I
    <Lv_list>::= <empty> | ; I <Lv_list>

Here <empty> stands for a null text string, that is to say a text string with possibly no characters at all. The symbol 'I' stands for an identifier as defined in the section on syntax.

Functions can become quite long because of the total absence of control structures such as if, then, for, while etc. Long functions are best broken up, or even written in C and added to the binary. The only control structure within a function is the branch, or goto instruction. Many short loops can be coded by recursion.
Define a function which calls itself. Most functions consist of sequences of statements and loops of the form:-

    I<-0
    AA: ->(I=N)/BB
    #mkdir #deb D<-LST[I:]     | Restore file system hierarchy
    I++
    ->AA

INTERACTIVE SHELLS AND SCRIPTS

The interpreter can read functions and data from scripts. The first parseable line in a file is taken to be the invocation of a startup function. Lines consisting of one or two upper case words will work.

These functions use labels and sub-functions. Labels stand at the beginning of a line, and are separated from the statements in a line by a colon. Labels are named variables, and their value is equal to the number of the line on which they stand. By tradition in APL the function definition line is line zero, and the other lines are numbered starting from one. The expression '->0', or goto line zero, is equivalent to a return from the function.

It is possible to chain together statements without writing functions. A text matrix can be treated as a set of APL statements, and run as a sub-program by means of an APL function which manages a line counter, and executes the statements in succession. Alternatively a table of statements may act as a sort of 'case' statement with one particular line selected to give an action. This is achieved by the primitive function 'x' (execute) or #do.

DATA TYPES

Traditional APL recognised a small number of fairly homogeneous data types. These were character data, integers and real numbers. D4 represents these in the machine with one, four (two on the 16-bit version) or eight bytes respectively. Given a symbol "A" the function #nc"A" gives the name class of the variable as 0 for undefined and then 1, 4, 8 etc. for the type of object.

A character scalar represents a character with ASCII value from 0 to 255. A character vector is any number of characters grouped into a string. A character matrix is a table, or array of characters with a certain number of rows and columns.

Atomic data types (int, double, char) are grouped in only two possible ways: as vectors or as matrices. A scalar is not a separate case, since there is no difference between a scalar and a one element vector. The null object may have any type.

RUN FILES AND WORKSPACES

Run files are ASCII files which define functions and data. Workspaces are collections of functions and data objects. Every workspace contains some pre-defined objects which are created when the interpreter is invoked. The most important of these is the symbol table which contains the names of all other objects, and the data associated with an object. There are at least three pre-defined symbols: a stack, a window list, and a table of file descriptors. A list of the names of these objects may be obtained by the name list function, '#nl'. Users should not attempt to modify the stack or other such objects. Further information is in APPENDIX C.

Run files are ASCII files consisting of blocks of text, or paragraphs, separated by blank lines. A paragraph consists of a header line followed by zero or more non-blank lines. Perl users will understand that scripts are files divided into paragraphs with the '\n\n' separator. Formatting programs, including many mailing programs, will generally destroy the validity of a run file.

When the interpreter reads any text file as a run-file it processes paragraphs and sometimes creates an object. Comment text is skipped. There are three types of object, with the type determined by the first line of the paragraph. The objects are data items, functions and startup text. Functions and text matrices have names. The startup text is simply a set of expressions on a single line.

Comments can be either UNIX shell style comments or C++ comments. These are blocks of text where all lines start with '#' or '//'. On UNIX the first line of a file may be a comment giving the name of the binary. Example: #!/usr/local/bin/d4x or #!/home/d4/cp/d4x.

    Object type          Header            Subsequent lines
    (1) function text    <Fd>              Function body
    (2) data text        "I"               Text matrix
                         "I" >> string     Here Document
    (3) startup text     <Fd>              none.
        old style        #do               Text on following line.

A text matrix and also a function body is initialised by a sequence of non blank lines.
If a text matrix needs to include blank lines it must be defined as a 'here document'. The header line consists of three tokens: "name_text" >> terminator. The name of the object is in double quotes, and the terminator is any string. The object is made up of subsequent lines of text up to but not including a single line containing the terminator.
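
The paragraph rules for run files, including the here document, can be sketched in a few lines of Python. This is an illustration of the file layout only, not a description of how the interpreter is written: the function name split_run_file and the regular expression are hypothetical, and the comment test is deliberately naive (a one-line paragraph starting with '#', such as a bare system function call, would also be skipped here).

```python
import re

# Header of a here document: "NAME" >> terminator
HERE_DOC = re.compile(r'^"([^"]*)"\s*>>\s*(\S.*)$')


def split_run_file(text):
    """Return a list of (header, body_lines) paragraphs from a run file."""
    lines = text.splitlines()
    paragraphs = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():          # blank lines separate paragraphs
            i += 1
            continue
        header = lines[i]
        i += 1
        body = []
        match = HERE_DOC.match(header.strip())
        if match:
            # Here document: keep every line, blank or not, up to but
            # not including the line holding only the terminator.
            terminator = match.group(2).strip()
            while i < len(lines) and lines[i].strip() != terminator:
                body.append(lines[i])
                i += 1
            i += 1                        # step over the terminator line
            paragraphs.append((header, body))
            continue
        while i < len(lines) and lines[i].strip():
            body.append(lines[i])
            i += 1
        block = [header] + body
        if all(l.lstrip().startswith(("#", "//")) for l in block):
            continue                      # comment paragraph, skipped
        paragraphs.append((header, body))
    return paragraphs
```

A here document is the only paragraph kind in which a blank line does not end the paragraph, which is why it is handled before the ordinary case.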

A SIMPLE SCRIPT FILE

Example: A run file to multiply two integers. "times.afn" contains four lines.

    TIMES;#quit

    Z<-TIMES
    Z<-$2*$3

The parser reads the paragraphs. The first line, TIMES;#quit, just contains two function calls. The first function, 'TIMES', is defined later in the script. The second is the built in function #quit, which exits the program. The body of the function 'TIMES' simply returns the product of the two strings $2 and $3, which are arguments passed on the command line for the interpreter. The run file can be activated by typing the command:-

    d4x times.afn 2 3
    6
    d4x times.afn 8669133453942347 81668888774549265858
    707998495821741647976208724707488726

In the above example the variables $0, $1, $2, $3 represent the positional parameters of the command line just as in the UNIX shell. Since $0 is the program name and $1 is the name of the runfile, arithmetic arguments start at $2, $3. It is also possible to use file redirection.

    TIMES;#quit

    Z<-TIMES
    $VA[0]<-($2*$3),#av 10 13     |Write product with end of line
    Z<-""

    d4x times.afn 8669133453942347 81668888774549265858 > result.num
    type result.num
    707998495821741647976208724707488726

File handle 0 is standard input for reading and standard output for writing. When using standard input it is necessary to set the shell to 'batch' mode. This is done by use of the command line parameter i=0.

    TIMES;#quit

    Z<-TIMES;$0;$1
    Z<-$VA[0]                     |Read a line from standard input
    0 r #split Z                  |Split into a pair of numbers
    $VA[0]<-($0*$1),#av 10 13     |Write to standard output
    Z<-""                         |Do not echo the result of the function.

    type num.tmp
    34555435 7723456
    d4x times.afn i=0 < num.tmp
    266887381783360

TAR FILES: PARSING BY STEALTH

An ordinary tar file can be made into a self extracting script. Prefix a function whose name is a single upper case string to any UNIX tar file. The contents of the first component of the tarfile is simply a script which contains the definition of a function with that name. The body of the function can include instructions to unpack the tar file, and then run further scripts depending on the contents of the tar file, and the installation of software and data on the current computer. This can be useful for installing new software on the computer, or for exchanging encrypted data between host and client machines.

There are several ways of creating workspaces or adding objects to workspaces. The most common ways are to run the D4 interpreter from DOS/WINDOWS or LINUX, or to use the #load or #copy functions when the interpreter is already running. The #sed function can also convert the text buffer into variables and functions. Use the ':f' exit. This means that functions could in theory be loaded in from embedded comments in other types of documents, for example HTML.

The '#load' and '#copy' commands both take a file name. Both functions read objects into a workspace from an ASCII file. The '#load' function deletes all existing objects before reading in the runfile and then starts the interpreter with the startup text. The #copy function merely adds functions and variables to the workspace. Numeric data types are not supported by these functions.

When the interpreter is run from the operating system (DOS or LINUX) the first parameter on the command line is taken as the run file. The effect of the DOS command 'd4\gc\d4x \d4\cp\test.afn' is equivalent to the D4 commands:-

    _UFILE<-$1<-"\d4\cp\test.afn"
    $0<-"\d4\gc\d4x"
    #copy _UFILE
    startup_text

Runfiles without startup text are often given the file suffix '.d4f'. Runfiles and other files containing definitions can be prepared by using any editor or wordprocessor capable of saving ASCII text.
The interpreter will correctly process files stored as either DOS or UNIX text files.

PRIMITIVE OPERATORS

Arithmetic: Result <- op S

    #abs               absolute value
    #atan              angle between line y=0, x>0 and points y,x.
    ?       Random     X<- ?N is a random draw 0<=X<N
    #ln     Log        natural logarithm
    p       power      R<- pX (#exp X) exponential function.
    #sqrt   root       R<- #sqrt X same as #exp 0.5 * #ln X
    #circle
    +/      Sum        R<- +/X sum vector, or columns of matrix
    */      Product    R<- */X
    f/      Minimum    R<- f/X Minimum of vector, or matrix
    c/      Maximum    R<- c/X

If X is a matrix, the operation is done for each column. To get the results for each row, use the transpose operation 'y' : R<- +/yX etc.

Arithmetic: Result <- A op B

    +       add            These operations work on
    -       subtract       integer matrices or vectors.
    *       multiply
    %       divide
    f       minimum        floor
    c       maximum        ceiling
    #mod    modulus        remainder
    #and    and            bitwise
    #or     or
    <       less
    >       greater
    =       equal          also works on character objects
    !=      not equal
    o       circular functions.

ARITHMETIC ON VECTORS AND MATRICES

Arithmetic operations work on numeric vectors and matrices. APL and its dialects are not strongly typed languages. Declarations are generally not necessary. Data is defined by assignment, or via the side effects of functions.

A vector is a set of numbers, arranged in order. The assignment A<-3 4 5 creates a vector with three elements. These can be written as A[0], A[1], A[2]. A vector with a single element is called a scalar. A matrix is a table of numbers with a specified number of rows and columns. If X is a numeric object then its size is given by the function 'r' standing for rank. In the example A<-3 4 5 the size of A, written rA, is three. Notice that no space is necessary in the expression rA. Function symbols are either signs such as +, -, *, % etc., or lower case alphabetical letters. Variable names used in expressions cannot contain lower case letters.

Arithmetic on vectors and matrices is carried out componentwise. If one of the arguments is a vector with a single element (a scalar) then that scalar value is used for each component of the larger operand. For example, if A<-3 4 5 and B<-7 4 1 then the arithmetic operations A+B, A*B, A+1, 2*B give the following results.

    A+B     10 8 6
    A*B     21 16 5
    A+1     4 5 6
    2*B     14 8 2

If C is given the value C<-12 73 8 3 then the operations A+C and B+C give errors. Arithmetic operations can only be used on vectors or tables of equal sizes, or between vectors/matrices and scalars.

There is no priority of evaluation for arithmetic expressions. Each expression is evaluated from right to left, so that 3*4+5 gives the result 27 (3*9) rather than 17.

Numbers are represented in the computer with several different formats depending on the implementation. These correspond to integer and double precision floating point values. Integers may be 16-bit integers (d4t) or 32-bit integers (d4x). Floating point values tend to be represented according to a common IEEE standard.
The interpreter decides how numbers are stored depending on the existence of a decimal point. Small floating point numbers can be converted to integers by means of the function #av. Z<-"N" #av 3.412 gives 3.

Older versions of APL had boolean values which were integers that were allowed to take the values 0 or 1. Sometimes implementors would try to store vectors and matrices of boolean values as packed bitmaps, but the tradeoff between space and time makes this an unrewarding exercise. Bitwise logical operations on integers are supported with #and and #or. 7 #and 6 gives 6. 12 #or 2 gives 14.
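
The two rules of this section, componentwise arithmetic with scalar extension and strict right to left evaluation, can be imitated in Python. The following is a minimal sketch for readers who do not have the interpreter to hand. The names extend, apply_op and evaluate are hypothetical, vectors are plain Python lists, and only a handful of operators are modelled.

```python
import operator

# A few dyadic operators, keyed by the spelling used in this manual.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "#and": operator.and_,
    "#or": operator.or_,
}


def extend(a, b):
    """Scalar extension: a one-element operand is stretched to fit."""
    if len(a) == 1:
        a = a * len(b)
    if len(b) == 1:
        b = b * len(a)
    if len(a) != len(b):
        raise ValueError("operands have different sizes")
    return a, b


def apply_op(op, a, b):
    """Apply a dyadic operator componentwise."""
    a, b = extend(a, b)
    return [OPS[op](x, y) for x, y in zip(a, b)]


def evaluate(tokens):
    """Evaluate [vec, op, vec, op, vec, ...] from right to left."""
    result = tokens[-1]
    for i in range(len(tokens) - 2, 0, -2):
        result = apply_op(tokens[i], tokens[i - 1], result)
    return result


A, B = [3, 4, 5], [7, 4, 1]
print(apply_op("+", A, B))                 # [10, 8, 6]
print(apply_op("*", [2], B))               # [14, 8, 2]
print(evaluate([[3], "*", [4], "+", [5]])) # [27]
```

Adding A to the four element vector 12 73 8 3 raises ValueError in this sketch, which plays the part of the error reported by D4.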

INTEGER ARITHMETIC ON DIGIT STRINGS.

The operations +, -, * (multiply), % (divide), <, >, and #mod (modulus, or remainder) work on text vectors treated as digit strings. This means that the PC user can simulate calculations made in the 1950s and early sixties, and much more besides. The classical example of recursive functions is the factorial function n!=1.2.3.4.... Not only can this be calculated to hundreds of digits, but the user can really see the improvement of divide and conquer algorithms over straightforward iteration.
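
To make the idea of arithmetic on digit strings concrete, here is a small Python sketch. It is not the algorithm used inside D4, which this document does not describe. The function names are hypothetical, the multiplication is the ordinary schoolbook method, and the factorial is written twice: as a plain loop and as a divide and conquer product over a range. No timing claim is made for either form.

```python
def mul_digits(a, b):
    """Schoolbook product of two non-negative decimal digit strings."""
    cells = [0] * (len(a) + len(b))
    for i, x in enumerate(reversed(a)):
        for j, y in enumerate(reversed(b)):
            cells[i + j] += int(x) * int(y)
    carry = 0
    for k in range(len(cells)):           # propagate carries, low to high
        carry, cells[k] = divmod(cells[k] + carry, 10)
    return "".join(map(str, reversed(cells))).lstrip("0") or "0"


def factorial_loop(n):
    """n! by straightforward iteration."""
    result = "1"
    for k in range(2, n + 1):
        result = mul_digits(result, str(k))
    return result


def range_product(lo, hi):
    """Product of the integers lo..hi inclusive, by divide and conquer."""
    if lo > hi:
        return "1"
    if lo == hi:
        return str(lo)
    mid = (lo + hi) // 2
    return mul_digits(range_product(lo, mid), range_product(mid + 1, hi))


def factorial_tree(n):
    """n! as a balanced tree of products."""
    return range_product(1, n)


print(mul_digits("34555435", "7723456"))   # 266887381783360
print(factorial_tree(20))                  # 2432902008176640000
```

The divide and conquer form keeps the two operands of each multiplication at a similar length, whereas the loop always multiplies one long string by a short one.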

MATHEMATICAL FUNCTIONS

APL provided primitive operators for common functions used in mathematics. The system functions #atan, #exp, #ln, #sqrt and #circle give most of these. The #circle function may be abbreviated by the single letter 'o'. Trigonometric functions are given as dyadic functions where the first numeric argument is a selector. Current values are:-

    Selector    Function call       result
     0          R<- 0 o X           floor (X)
     1          R<- 1 o X           sin(X)
     2          R<- 2 o X           cos(X)
     3          R<- 3 o X           tan(X)
    _1          R<- _1 o X          arcsin(X)
    _2          R<- _2 o X          arccos(X)
    _3          R<- _3 o X          arctan(X)
                R<- #sqrt X         square root of X
                R<- #ln X           natural logarithm
                R<- #exp X          exp(X)
                R<- #atan YX        arctan2 (Y,X) with sign.

The #atan function works on pairs of numbers. The result is only half the size of the argument. It was added to facilitate the conversion of cartesian to polar coordinates.

Note: Domain errors are trapped, but ignored. This means that calling a function such as #sqrt with a negative argument will not stop the program, but there may not be an error message. These functions were all introduced to facilitate low resolution animations of dynamical systems where evaluation will take place at many points. Singular points will give random results but the show will go on. When in doubt operate the brain rather than the program.
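
The selector table and the pairwise behaviour of #atan can be mirrored with Python's math module. This is a sketch under stated assumptions: the names circle, atan_pairs and to_polar are hypothetical, and the argument of atan_pairs is assumed to be laid out as y0, x0, y1, x1, ..., which the text above does not spell out. Unlike D4, Python raises ValueError on a domain error such as arcsin of 2 instead of carrying on.

```python
import math

# Selector -> function, following the table of circular functions.
CIRCLE = {
    0: math.floor,
    1: math.sin,
    2: math.cos,
    3: math.tan,
    -1: math.asin,
    -2: math.acos,
    -3: math.atan,
}


def circle(selector, x):
    """Dyadic circular function: the selector picks the operation."""
    return CIRCLE[selector](x)


def atan_pairs(yx):
    """One signed angle per (y, x) pair; the result is half as long."""
    if len(yx) % 2:
        raise ValueError("expected an even number of values")
    return [math.atan2(yx[i], yx[i + 1]) for i in range(0, len(yx), 2)]


def to_polar(x, y):
    """Cartesian (x, y) to polar (radius, angle)."""
    return math.hypot(x, y), math.atan2(y, x)


print(atan_pairs([1.0, 1.0, 0.0, -1.0]))
print(to_polar(0.0, 2.0))
```

Using the two argument arctangent keeps the quadrant information that a plain arctan(y/x) would lose, which is what "with sign" refers to in the table.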

STRUCTURAL OPERATORS

Monadic: Result <- op X

    r                 Enquire shape. rX is the shape of X
    r:                Shape of object, as a matrix.
    i                 Index generator
                          iN      numbers 0 to N-1
                          iN,M    pairs (i,j) i: 0 to M-1, j: 0 to N-1
                                  shape of result is (M*N),2
    ,                 Ravel       Convert matrix to vector.
    y                 Transpose   R<- y X          Matrix transpose.
    m   #phi                      R<- m X          Flip columns
        #theta                    R<- #theta X     Flip rows

Dyadic: Result <- A op B

    r    Shape          X<-4 5 r 1 2 3        Make 4 x 5 matrix
    ,    Concatenate    Z<-X,Y                Join X and Y
    i    Search         IDX<- A i B           Position of B in A
    t    Take           R<- (M,N) t X         Make new object, M rows, N columns.
    d    Drop           R<-(M,N) d X          Get rid of M rows and N columns of X.
    /    Reduce         R<-U/X                Compress. Keep columns with U[j]>0
    /:   Reduce         R<-U/:X               compress rows.
    j    #find          R<- STR #find TEXT    String search. R is a 0 1 vector.
    m    #phi           R<- N m X             Rotate each row
         #theta         R<- N #theta X        Rotate each column.
    b    #base          R<- RADIX b TAB       Similar to APL encode
    n    #represent     R<- RADIX n Z         and decode.
    z    format         R<- FW z X            Numeric -> ASCII
                        R<- FW #fmt X         FW = width, decimal places.

POSTFIX INCREMENT AND DECREMENT

Postfix operators:

    R<-X++
    X++         increment.
    X--         decrement.
    A[J]++      works on subscript vector J.

SELECTORS, SUBSCRIPTS AND INDEXING

Unlike many other languages such as BASIC or C, subscripts may be vectors or word lists. The square brackets '[' and ']' are used to indicate subscripting into any array, or table. Only one level of indexing is allowed. Subscripts are normally integers or integer vectors, but strings, or tables of strings, may be used as subscripts for associative arrays. When subscripts are used with a table, or matrix, a whole row is returned for each index element. All subscripting operations use the value 0 to index the first element of an array.

Out of range subscripts do not generate errors. On the right hand side of an expression they give zeros or blanks, depending on the object type; for assignments (X[J]<-DATA) out of range subscripts act as no-ops.

    R<-V[I]             Select elements with subscripts I
    R<-M[I]             Select rows from matrix, columns from vector.
    V[I] <- DATA        Fill positions in the list I from DATA.
    V[I:]<- DATA        Fill rows I in vector as 1 row table.
    M[I] <- DATA        Set rows in positions indexed by I.
    M[I]++, M[I]--      Increment or decrement elements indexed by I.
    R<-M[:J]            Select columns of vector/matrix.
    V[:J]<-DATA         Set columns of vector/matrix.

The colon modifier is a compromise. Normally a table could be indexed by two sets as in TAB[ROWS:COLUMNS], but the parser cannot handle this. In particular, lists of files or directories may have only one row, so it is important to have an easy way of selecting all the elements of such tables. If a table has just one line it is treated as a vector. To force the interpreter to treat it as a one line table use the colon modifier on the rank function 'r'. To process all files in a directory use a construct such as:-

    FL <- #files DIR, "*.txt"
    ->(0=r,FL)/XX
    I<-0 & N<-1tr:FL            | r:FL is the shape of FL as a table.
    AA: ->(I=N)/DONE
    X<-FUNCTION DIR, FL[I:]
    I<-I+1 & ->AA
    XX:
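
The treatment of out of range subscripts is unusual enough to deserve a small model. The Python sketch below covers vectors only and is not taken from the D4 sources. The names select and assign are hypothetical, and it assumes that DATA is paired with the subscripts position by position, so that a value whose subscript is out of range is simply dropped.

```python
def select(vector, indices, fill=0):
    """Right hand side V[I]: out of range subscripts give the fill value.

    The fill is 0 for numeric data; pass fill=" " for character data.
    """
    return [vector[i] if 0 <= i < len(vector) else fill for i in indices]


def assign(vector, indices, data):
    """Left hand side V[I]<-DATA: out of range subscripts are no-ops."""
    for i, value in zip(indices, data):
        if 0 <= i < len(vector):
            vector[i] = value
    return vector


V = [3, 4, 5]
print(select(V, [0, 2, 7]))          # [3, 5, 0]
print(assign(V, [1, 9], [40, 90]))   # [3, 40, 5]
```

Because nothing is reported when a subscript falls outside the object, a script that depends on exact bounds has to compare the subscripts with the shape itself.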

ASSOCIATIVE ARRAYS AND LISTS

D4 makes use of associative arrays and lists. There is a list associated with each window. Its lines may be accessed by using $T[Range]. A variable 'NN' may become a symbol table, indexed by strings. Any occurrence of NN[name] as an L-value sets the name in the table. The complete list of names is given by the function NAMES<- #use NN. The name list for an associative array can also be extracted with the '$' include command for the editor.

    R<- size #use NN        Declare name as array[size]
    R<- #use NN             List of names in use
    NN[str] <- val          Set value
    val <- NN[str]          Get value. Undefined returns "".

When size is the pair (2,N) names are associated with values. When size is a number N, the table is used for storing word counts. The index set may be a matrix.

    NN[idx]++               Increment word counts.
    NN[idx] <- RHS          Assign value to set of names.

The index variable can be a matrix of names, with respective values taken from the RHS if it is a vector. To make a frequency count of the words in a file use a function such as the following:-

    D<-NFC F;AN;C.0;N;Z          |Frequency count items in a list.
    AN<-#sstomat _CR,LOAD F
    1200 #use "C.0"
    C.0[AN]++                    | This does all the work.
    Z<-#sstomat #use C.0
    N<-,C.0[Z]
    D<-Z," ",zyN
    D<-D[m #sort N]              |Sort by frequency count

There is also a virtual array associated with each open file. For file number N use the syntax $VA[N, {RP}] where RP is a record pointer. $VA[0], $VA[1] and $VA[2] have special meanings.
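
For comparison, the word count performed by the indexed increment on an associative array can be written with Python's collections.Counter. This is a sketch of the same idea and not a translation of NFC: the name frequency_table is hypothetical, words are taken to be separated by white space, and the most frequent words are listed first (with ties in alphabetical order), which is an assumption about the sort direction.

```python
from collections import Counter


def frequency_table(text):
    """Return 'word count' lines, most frequent word first."""
    counts = Counter(text.split())      # one increment per occurrence
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{word} {count}" for word, count in ranked]


for line in frequency_table("b a b c a b"):
    print(line)
# b 3
# a 2
# c 1
```

The Counter plays the part of the table declared with #use, and building it from the list of words corresponds to the single indexed increment over the whole matrix of names.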

PROGRAM CONTROL

#goto    ->EXPRESSION

If the expression is an integer within the line range of a function, then goto that line. If the expression evaluates to an empty object, do nothing. Line numbers outside the range of a function cause a return. If EXPRESSION is a string, then do nothing if the string is empty, or else branch to the first label in the function which matches EXPRESSION. Labels in the function may be quoted strings followed by a colon. This serves as a sort of case statement.

#do      #do EXPRESSION

EXPRESSION should be a text vector, which contains text to be interpreted. This function is used in the D-shell and also in menu handling.

(function call)

Call a function simply by writing its name and parameters as required.

NEW CONTROL STRUCTURES: IF, WHILE, WEND, BREAKIF, REPEATIF

Because APL parses from right to left this form of program control is effected by functions.

    IF expression; statements

IF is a monadic function. If the argument evaluates to false, the remaining statements on that line are skipped.

    WHILE expression; ... ; statements; WEND

WHILE is a monadic function. Its argument is evaluated to true (1) or false (0) and, when it is false, a branch is made to the line following the next WEND within the current function. WEND is a niladic function, giving an unconditional branch to the preceding WHILE statement. BREAKIF and REPEATIF are monadic functions within the body of a WHILE ... WEND loop, and effect conditional branches to either the line after the end or the beginning of the loop.

MENU FUNCTIONS

    K<- (WW,K) #qmenu TEXT
        WW      A vector containing a window specifier:
                y, x, height, width, {starting line}
        TEXT    A matrix giving a list of options.
        returns an index, or -1.

    P <- PZERO #zgets WLIST
        PZERO   initial selection
        WLIST   a matrix of points or window specifiers.
        returns an index.

    K <- #sed STR
        STR is a string of editor commands including the command ':e' for full screen editing.

    K <- k
        Input a single key.

All menu functions work via use of the cursor keys. For 'select' and 'exit' use the keys carriage return (CR) or Escape (ESC). The Meta or Alt key generally has no effect.

The LINUX version supports the mouse when running under X windows. Microsoft versions must use #int 51, as shown in some example scripts. The definitions are in ../inc/na.h and ../inc/htext.h while the implementations are defined in the source files ../screen/*.c . Mouse button one generally works as 'select' while other buttons generally cause 'exit'. Mouse support uses the fact that the return value of a keystroke is an integer. Real characters only use one byte, so all other values can have special meanings.

DIRECTORY FUNCTIONS

#files

    A<-16 #files"*.*"       gets directories.
    A<- #files"\*.*"        root directory.

The result is a matrix, and it can be used in the menu functions. The left parameter is the search attribute, as known to DOS. On a DOS system this function lists all files; on UNIX systems the result depends on who is running the program. This means that the function will find files such as MSDOS.SYS on PCs. Do not try modifying such files unless you know what you are doing.

#fstat

    S <- #fstat name

This function returns the UNIX status of a file. S is a bit significant value, and it contains the value returned by the operating system. Examples:-

    #fstat "xyz"                no file        0
    (5r8)n #fstat"d4.doc"       this file      0 0 6 6 6
    (5r8)n #fstat"d4x"          executable     0 0 7 5 5
    (5r8)n #fstat"/home"        a directory    4 0 7 5 5

FILE FUNCTIONS

Smaller files.

    DATA <- #zload F            read file
    R <- DATA #zsave F          write block to file

Files containing functions:

    #load filename              re-start using functions in file.
    #copy filename              add new functions from filename.

Larger files.

    #close, #read, #fnames, #write, #open, #fsize.

A filename F may be a full pathname. Non zero results from #open and #write indicate errors. End of the input file is indicated when the number of bytes returned by #read is less than requested.

Any file is treated as an array of bytes, just like the low level file functions of C. The size of a record is specified by the read function (#read). Any ordinary file consists of a header, and then records. The #read function takes four parameters: the file tie number (N, which is 0 for stdin, stdout), the offset at beginning of file (start), the record length (size), and the record pointer (rp). On 16-bit systems, this enables large files to be processed. Changing the value of size allows many records to be accessed and displayed on a screen.

    R <- F #open N, mode

Open the file F. N is the tie number, used in #read and #write. The mode has the values N_READ (1) for read only access, N_USE (3) for read/write and N_STR (5) for stream files.

    DATA <- #read N, start, size, rp        fp = start + rp * size
    DATA<-$VA[N, start, size, rp]
    DATA <- #read N                         Next line in stream files.
                                            A value of -1 indicates end of file.
    LINE<- #read 0                          Read next line from standard input.
    R<- DATA #write N, start, size, rp      write data at start + rp * size
                                            rp = -1 for end.
    $VA[N, start, size, rp]<-DATA
    R<- DATA #write 0                       Write to standard output file.
    R <- "" #open N                         close file N

#close NLIST

    R <- #close n1,n2,...nk                 close files in list
    R <- #close                             close all files

#fnames

    R<- #fnames                             Give table of numbers, names
    R<- #fnames N
    R<- #fsize N, BLOCK                     R is a pair of integers (NB,R) and the
                                            actual file size is R+NB*BLOCK.

A file can be set to zero length by using the #zsave function, before using any of the other file operations. These functions are very primitive and the file size is limited by the size of 16-bit integers on DOS systems. The (offset, size, pointer) triple is subject to restrictions: allocating large blocks does not work well, so 16K is a maximum reasonable value. The #read and #write functions simply translate the (offset, size, pointer) triple to a long integer, giving about 30-bit byte addressing within a file. Although it slows down operations, files are opened and closed for each separate operation.

    Z<-READ_TTY PROMPT                      Get a line from terminal with echo.

This new function is meant for scripts where full screen IO is impossible, for example on dumb terminals. The optional argument is the prompt.
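
The record addressing used by #read, #write and #fsize is plain byte arithmetic, and can be tried out in Python. The sketch below is an illustration of the formula fp = start + rp * size and of the (NB,R) pair, not the D4 implementation. The function names are hypothetical, a file path stands in for the tie number, and each call opens and closes the file, as the text says D4 does.

```python
import os


def read_record(path, start, size, rp):
    """Read up to `size` bytes at byte offset start + rp * size.

    A result shorter than `size` signals the end of the file.
    """
    with open(path, "rb") as handle:
        handle.seek(start + rp * size)
        return handle.read(size)


def write_record(path, start, size, rp, data):
    """Write one record; rp = -1 appends at the end of the file."""
    with open(path, "r+b") as handle:
        if rp == -1:
            handle.seek(0, os.SEEK_END)
        else:
            handle.seek(start + rp * size)
        handle.write(data[:size])


def fsize_pair(path, block):
    """Return (NB, R) such that the file size is R + NB * block."""
    return divmod(os.path.getsize(path), block)
```

For a 15 byte file and a block of 4, fsize_pair gives (3, 3), since 15 = 3 + 3*4. Splitting the size into a pair keeps each number small, which is the point on systems with 16-bit integers.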

SOCKET FUNCTIONS

    Z <- HOST #socket SNUM, MODE     open socket
    Z <- "" #socket SNUM             close socket
    Z <- #socket                     return socket table.

HOST is a numeric 4-vector such as 127 0 0 1 and (SNUM, MODE) is a pair of integers giving a socket number and a mode. The mode is the same as for files: 1 for read, and 3 for write. To close a socket it is necessary to use the syntax Z<-"" #socket SNUM. The expression Z<-#socket returns the internal table used to manage sockets.

Data is passed between processes by the magic variable $SOCKET[] which is syntactically a subscripted variable.

    DATA<- $SOCKET[SNUM, MAXBYTES]   get data from socket
    $SOCKET[SNUM, TARGET]<-DATA      send data to TARGET

TARGET is a 4-integer host number.

WINDOWING SYSTEM

Memory mapped video functions.

    SIZE <- #screen
    SIZE <- #screen ROWS, COLS       Set screen size.

This function allows for the size of the text screen to change. Because different fonts can give anything from 18 to 600 rows of text, some mechanism is necessary to keep track of the screen size.

    R <- #wget WW
    R <- DATA #wput WW

These functions read text from a sub-window, or write text to a sub-window. On UNIX these functions work, although a memory mapped screen is simulated. A window specifier works in full screen text mode, with values WW = y, x, height, width in characters. On DOS systems #wput may be used to set attributes such as the colour of the text by making DATA have an integer type.

    R <- BM #kjput CZ, {colour}
    R <- #kjget CZ, dy, dx

These functions work with bitmaps in graphic mode. The data is rasterised: that is to say, consecutive bits in a row are packed as bytes. The pixel size of a bitmap is (rows, columns x 8). The bitmaps are aligned on byte boundaries. EGA/VGA video modes must be selected by the DOS interrupt 16 (10H).

There is also a method of associating areas of the screen with transcripts. A transcript is an array of strings. The #window function creates a structure defining a screen area, a language dependent input method and an empty transcript. The lines of a transcript may be accessed either by means of subscripting a special array ($T) or by use of the editing function #sed. The expression $T[IX] gives a matrix consisting of the indexed lines in the transcript of the last referenced window. The selected screen area gives a view onto the transcript. The variable $V controls what is seen.

    $V[0]      Current line. The data is $T[$V[0]]
    $V[1]      Horizontal offset. You see (0,$V[1]) d $T[$V[0]]
    $V[2 3]    Cursor position within the window.

Once a window number is selected, the value may be omitted from functions such as #window, #sed, #nfind etc., when the last selected number will be used. The window corresponding to 0 is special: it is a default for some functions.

Usage:
    Result <- DATA #window FC, { HANDLE }

    FC         function code
    HANDLE     an optional number specifying a window.

    FC     VALUE                          comment
    -      -       R<- #window            usage list. int vector.
    DEL    0       #window 0 { ,N }       delete
    INIT   1       WW #window 1,N         setup window n.
    MODE   2       A #window 2            set orientation, attributes
    SIZE   3       WW #window 3, N        Re-size.

The mode value is used for different script types. At the moment the main options are right to left writing, and selection of an alternative keyboard mapping. The input modes are bit significant for the time being, but that will change. The alternative keyboard mapping is also toggled by the ^A key.

CURRENTLY SUPPORTED WINDOW FUNCTIONS 30/03/1998

    fc     function code
    wn     window number, less than the 'W=n' parameter given at startup.
           If omitted, the last set value is used.

Niladic form
    R <- #window
           Gives a vector R so that R[j] = 1 if window in use, and R[j] = 2 or 3 if window is selected. R[j] = use(j) or selected (j).

Monadic form
    R <- #window i0              USE      as niladic form.
    R <- #window 0, wn           DEL      delete window
    IMODE <- #window 2, wn       IMODE    get input mode
    WW <- #window 3, wn          AREA     area for display

Dyadic form
    Z <- WW #window 1, wn        INIT     setup window
    R <- IMODE #window 2, wn     IMODE    set input mode
    R <- AREA #window 3, wn      AREA     set area of display

TEXT EDITING FUNCTIONS

#sed
    R<- N #sed CMD
    R<- #sed CMD
    KEY<- #sed

The edit string contains a command. Some commands work globally on the file. A blank command just shows the contents, in the last assigned window. Some commands give results to D4. Others give return codes, including a (-1) from cursor setting commands to signify end of file.

If the command string starts with an 'X', the rest of the string is parsed for semi-colon separators (;) and each substring is passed as an edit command. For example the command "Xs/cat/dog/*;s/white/black/*" will change all occurrences of 'cat' to 'dog', then all occurrences of 'white' to 'black'.

GLOBAL COMMANDS - these work on the whole file

    :b                   break lines at newline characters.
    :c {v|m}             cut {vector | matrix}
    :d{t}                delete buffer (also :q).
    :e                   edit
    :f                   exit and create function.
    :i file              include file at cursor
    :o name              cut to named object
    :p {n}               paragraph size
    :r file              delete transcript, then read new file
    :s                   show
    :u file              write as unix file ('\n' line break).
    :w file              write file. DOS style line break.
    g/str1/              select lines containing str1
    v/str1/              delete lines containing str1
    s/str1/str2/{g}      substitute with regular expressions

These functions do not use regular expressions

    S/str1/str2/{a}      global replace

The 's' and 'S' commands take the postfix options:

    *      global replace i.e. 'g' and 'a'.
    g      replace once in each line
    a      replace all occurrences in a line.

String find and replace functions return a count of the number of strings found, or replaced. The delete function (:d) may take a modifier. ':dt' deletes trailing blank lines of a transcript.

LOCAL COMMANDS - these work on the current line

    T                        tag current line
    D                        delete tagged range
    C                        copy tagged range to current line
    M                        move tagged range to current line
    o name                   cut tagged block to named object
    d                        move current line to scrap. i.e. delete.
    d/str/                   delete until line containing /str/
    y                        copy current line to scrap
    p                        pop line from scrap, before current line
    j                        join two lines.
    i text
    i/text/                  insert string after cursor
    r/text/                  replace current line
    r text                   replace line
    . {number | /str/ }      set cursor or goto line
    . n                      goto line n               LC <-n
    . /str/                  goto line containing string
    .z                       end of file               LC <- size-1
    -n                       go back n lines
    +n                       go forward n lines
    @                        R<-T[LC++ ]
    @ n                      R<-T[LC<-n]
    $ var                    fetch object

Fast string operations. These do not use regular expressions. The delimiter character may be /,|,[ or ]. These commands are repeated via the ^N key, when in the editor.

    /str1, /str1/            find the string str1.
    /str1/str2/              replace str1 with str2

#print    Print data to buffer
    R<- {S} #print X         S<-S,X

#lprint   Send data to printer port.
    R<- #lprint X

#rxfind   Search string for regular expression
    R<- STR #rxfind DATA

R is an integer pair.

    R[0]     starting position of STR in DATA
    R[1]     count of matching characters.

DATA[R[0]+iR[1]] is the string matched.
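
The (start, count) result of #rxfind can be mimicked in other languages. The following Python sketch is an illustration only: the function name rxfind is borrowed for convenience, Python's re module is not the regular expression dialect used by D4, and returning None when nothing matches is an assumption, since the text above does not say what D4 returns in that case.

```python
import re


def rxfind(pattern, data):
    """Return (start, count) for the first match of pattern in data.

    Returns None when there is no match (an assumption, not D4 behaviour).
    """
    m = re.search(pattern, data)
    if m is None:
        return None
    return (m.start(), m.end() - m.start())


text = "the cat sat"
start, count = rxfind(r"c.t", text)
# Equivalent of DATA[R[0]+iR[1]]: the slice of matched characters.
matched = text[start:start + count]
```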

EDITOR KEYS

    ^A        alternate language
    ^B  F2    exit                          ^ stands for the control key.
    ^C        exit (copy range)             The ALT key has no function.
    ^D        exit (delete range)           INS    insert mode (on +, off -)
    ^E        delete to end of line         CR -   cursor down
    ^F  F3    find / replace                CR +   break line
    ^G        exit (get object)             DOWN
    ^L        exit (follow link to file)    DEL    delete character
    ^N        find / replace next           DEL    join two lines
    ^O        exit (make object)            END    end of file
    ^P  F4    push line                     ESC    command mode, exit
    ^Q  F8    exit (quit)                   HOME   start of file
    ^R  F5    restore line                  PGDN   next screen
    ^S  F10   save changes                  PGUP   previous screen
    ^T  F6    tag line range                TAB    tab forwards
    ^X  F7    exit (move range)             UP
    ^W        exit (width)                  F9
    ^Y        exit

Find syntax: the command '/str1' will find the first occurrence of 'str1'. Use ^N to find the next occurrence. A command '/str1/str2/' will replace 'str1' with 'str2'. Use ^N to progress through the file. If you wish to change a string containing a slash, then edit the first character of the command. To change '/*' to '|', simply type S'/*'|'.

Editing may occur in English or from right to left. In addition a set of rules may be given for composite characters. What happens during editing is controlled by a window mode. Fonts are obtained from bios calls on PCs, or selection of a font for a whole process window (in which D4 runs) on UNIX systems. Keyboard tables may be set so that the alpha-numeric keys give any desired code.

The editing mode byte is bit significant:

    0X01     map keys into upper character space (0X80-0XFF).
    0X02     reverse orientation: right to left input.
    0X04     reverse input (push characters forwards).
    0X08     Allow composite character formation. ($ALPHA)
    0X10     Modify characters on input ($BETA)
    0X80     Inverse video current line.

The keyboard tables also map all the cursor movement keys to single byte control codes. This means there is a shortage of such keys for the editor, so some keys have a dual meaning, depending on the insert mode. In particular the carriage return splits lines in insert mode.
Keyboard translation is achieved by setting the environment variable $KB, a 256 byte table. It is quite possible to disable the keyboard completely by injudicious use of this function. Linux users may enjoy trying to set up the environment variable $KB_SEQ. This is a text matrix where each row defines a single key value followed by a count prefixed string. This structure is set up by scripts which interrogate the terminfo database. The compiled program uses a hard coded version of 'linux' console keys. It would be easy enough to change this.
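
Because the editing mode byte described above is bit significant, a mode value is simply the sum of the flags that are switched on. The Python sketch below shows how such a byte can be taken apart; it is an illustration, not part of D4, and the names EDIT_MODE_BITS and decode_edit_mode are hypothetical. The flag values and descriptions are copied from the table in this section.

```python
# Flag values as listed for the editing mode byte.
EDIT_MODE_BITS = {
    0x01: "map keys into upper character space (0x80-0xFF)",
    0x02: "reverse orientation: right to left input",
    0x04: "reverse input (push characters forwards)",
    0x08: "allow composite character formation ($ALPHA)",
    0x10: "modify characters on input ($BETA)",
    0x80: "inverse video current line",
}


def decode_edit_mode(mode):
    """List the descriptions of every flag set in the mode byte, lowest bit first."""
    return [text for bit, text in sorted(EDIT_MODE_BITS.items()) if mode & bit]


# Upper character space combined with right to left input.
for line in decode_edit_mode(0x01 | 0x02):
    print(line)
```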

INPUTTING TEXT

#inlin    input at cursor position. Force use of existing edit functions.

k
    R <- #inkey {wait_time}
    R <-k {wait_time}

The #inkey function returns an integer, the ASCII value of the last key pressed. This function also returns the key used to exit the #zgets function. When a wait_time parameter is specified the function returns the value 0 on time out, without waiting for further input.

#zgets
    J<- {J_START} #zgets YXLK

Allows selection from a series of field specifiers. YXLK is an NF x NC matrix, with at least two columns. This function pushes the last key.

GRAPHIC FUNCTIONS

The Chinese character functions #kjput and #kjget work with packed rasterized bitmaps on byte boundaries for the EGA/VGA type displays. By using #kjput an entire scanline, or group of scanlines, may be written. Any image which can be assembled line by line may be written (slowly) to the screen. It seems unnecessary to have any more graphic functions.

Because of the nature of a multiscript system, most printing gets done in bit-image mode. This means that the graphics functions enhance the printer rather than the video. The video display functions are included merely to give a preview of what is to be printed.

Some line drawing functions are included. They work on sets of points which are represented by Nx2 matrices. Sometimes a set of points can be computed by adding displacements. The video is seen as a grid, with the 'taxi-cab' metric. A move table is a set of displacements: North, South, East, West. D4 is quite suited to doing calculations on these arrays. Two functions return these arrays.

    DZ <- #kjvec A,B     |A|>|B| gives |A| points line Ax-By = 0
    DZ <- #kjarc R       points on arc (PI/4) with center at (0,R).

For compatibility with the text handling functions the co-ordinates are represented as (y, x) pairs. The arrays DZ represent move tables. The only possible values are -1, 0 or 1.

A set of points may be displayed by means of the #kjplot function.

    Z<-CM #kjplot Z              plot points
    Z<-CM #kjplot Z,Z+(rZ)rDZ    draw rectangles, or squares

Z is a set of points (y,x pairs) and CM is a vector containing one or two values: the colour and the drawing mode (PSET, AND, OR, XOR). When Z is an Mx4 matrix the function draws rectangles.

There are two block fill functions: #kjbloc and #kjfill.

    Z<- CC #kjbloc ZB

ZB is an Nx4 matrix of rectangle specifiers. CC defines the colour and drawing mode. The fill pattern is taken from the matrix $TILE.

If Z is a set of points then

    HZ<- #kjfill Z

gives a set of one line tiles which cover the interior of Z. These are stored as triples: (y, xa, xb) with xa < xb. The function would be very easy to write if there was enough memory to store every pixel in what could be a quarter megabyte array. The difficulty comes from the fact that the list of interior pixels is compressed as a set of horizontal tiles. See the file <delta.d4f> for examples.
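
Two of the ideas in this section, move tables and horizontal tiles, can be sketched in a few lines of Python. This is an illustration only and not the D4 implementation: the names apply_moves and to_tiles are hypothetical, to_tiles covers only the compression of an already known pixel list (it does not compute the interior of an outline as #kjfill does), and storing a single pixel as a tile with xa equal to xb is an assumption about the end-point convention.

```python
from itertools import accumulate


def apply_moves(start, moves):
    """Points reached from start by adding each (dy, dx) displacement in turn."""
    def step(point, move):
        return (point[0] + move[0], point[1] + move[1])
    return list(accumulate(moves, step, initial=start))


def to_tiles(pixels):
    """Compress (y, x) pixels into (y, xa, xb) runs of adjacent pixels on one row."""
    tiles = []
    for y, x in sorted(set(pixels)):
        if tiles and tiles[-1][0] == y and tiles[-1][2] == x - 1:
            # Extend the current run by one pixel.
            tiles[-1] = (y, tiles[-1][1], x)
        else:
            # Start a new run; a lone pixel has xa == xb (assumed convention).
            tiles.append((y, x, x))
    return tiles


# A move table using only the values -1, 0 and 1, in (y, x) order.
path = apply_moves((0, 0), [(0, 1), (0, 1), (1, 0), (0, 1)])
runs = to_tiles([(0, 1), (0, 2), (0, 3), (2, 5), (2, 7)])
```

Storing runs rather than single pixels is what keeps the memory cost low: a filled row costs one triple however wide it is.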

LOW LEVEL FUNCTIONS

#peek
    R<- #peek ADDR
ADDR is an N x 2 integer matrix, containing segments and offsets. The function returns a vector of byte values, read from successive locations. On 32 bit systems ADDR is simply a vector of virtual addresses.
    DATA #peek ADDR
Set the specified addresses.

#nc
    R<- #nc NAME
Name class. An integer, giving the type of the object specified by NAME. 0 for undefined, 1 for text, 2 or 4 for integer depending on 16/32 bit versions, 8 for double (floating point), and so on. The values are fully specified in the header file <na.h>.

#varptr
    ADDR <- #varptr VARIABLE_NAME
    16-bit: Segment:offset for data of NAME
    32-bit: Integer as virtual address of contents of NAME
Use in #call for 16-bit systems. Use as a test for which interpretor is running.

#ifdef
    R <- #ifdef NAME
Empty vector if there is no object called NAME, otherwise the value of NAME. Not useful. It is the same as R<-"" #ifndef NAME.

#ifndef
    R<-VAL #ifndef NAME
If there is an object called NAME, then return that value, otherwise use the left argument. This allows functions to take 0, 1 or 2 parameters.

#ex       Expunge.
    R<- #ex NAME
Release space taken by object called NAME. There is still a slot in the symbol-table for name, but #nc, #ifdef, #ifndef will work as though NAME is now undefined.

Note: The functions #nc, #varptr, #ex will also work on text matrices which serve as lists of names.

#int      Machine interrupt
    Z<- (AX,BX,CX,DX) #int n
Z is value of returned registers AX,BX,CX,DX

#call     Interface with machine code.
    Z<-(x1,x2,x3,x4,x5,x6) #call TEXT
TEXT is a set of bytes containing 8086 instructions. These instructions should be included:

    start                PUSH BP
    set base pointer     MOV BP,SP
    access parameters    x1, WORD PTR[BP+06]
                         x2, WORD PTR[BP+08]
                         x3, WORD PTR[BP+0A]
                         x4, WORD PTR[BP+0C]
                         etc
    return               POP BP
                         RETF

The 80x86 instructions may be created with the 'debug' utility. First a text file is created with the debug command a100 followed by the assembler instructions, a blank line, and then the commands u100, q. Next debug is invoked with the command line 'debug < file'. A listing appears on the screen with the code bytes somewhere on the left of the screen. These may be cut by means of the 'ww' alias, which cuts a rectangular block from the display screen. After elimination of blanks these bytes give the required machine instructions.

#ts       Time stamp.
    R <- #ts
DATE-TIME as year, month, day, hour, minute, second. This is in the British format.

#port     comm port functions
    R <- {DATA} #port function {portnum} {option}
Don't use. Wait for #socket when that gets done.

DATA CONVERSION

#av       ascii-codes.
    R<- #av S            Convert
    R<- "N" #av S        Double -> Integer
    R<- "D" #av S        Integer -> Double
Also could use number of bytes.

FULL LIST OF FUNCTIONS

    R <- #atan YX                          arctan on pairs (y,x)
    R <- #av x                             ascii/numeric conversion
    R <- (X1,X2,X3,X4,X5,X6) #call txt     DOS/WINDOWS 80x86 code.
    R <- #circle X                         trigonometrical functions.
    R <- #close handle(s)                  close files
    R <- #cmd command                      dos/unix command
    R <- #copy "filename"                  add functions
    R <- #cursor                           cursor position
    #cursor Y,X                            set cursor
    R <- #del file                         delete a file
    R <- A #equiv B                        A == B (type, size, value)
    R <- #ex idlist                        erase objects
    R <- #exp X                            exponential function
    R <- #fi chardata                      evaluate numeric text
    R <- pattern #find data                string search
    R <- string #fmt data                  numeric format
    R <- #fstat pathname                   returns file mode bits for pathname
    R <- #ifdef name                       (name defined) ? value : empty
    R <- EXPR #ifndef name                 (name defined) ? value : EXPR
    R <- #inkey                            get key
    R <- (registers) #int interrupt        dos interrupt
    R <- #ln X                             Natural logarithm
    R <- #lprint X                         send character vector to printer
    R <- #kjarc RADIUS                     PI/4 arc of circle at center (R,0).
    R <- #kjfill Z                         return interior as horizontal tiles
    R <- #kjget WJ,SI                      get bitmap from WJ[], colour SI
    R <- (SI,M) #kjplot Z                  plot points at Z, colour SI
    R <- BM #kjput CZ,SI                   put bitmap at CZ, colour SI
    R <- CM #kjvec DZ                      draw line segments
    #load file                             load function file
    R <- {FS} #mattoss                     matrix to delimited string
    R <- #mkdir DIR                        make a directory
    R <- #mouse YX                         mouse support.
    R <- #nc idlist                        name class
    R <- {WP} #nfind string                line numbers of string in $T[WP]
    R <- {WP} #nxfind pattern              line numbers of pattern in $T[WP]
    R <- #nl type                          name list
    R <- name #open handle, mode           open a file
    R <- #peek locations                   memory access
    R <- {bytes} #peek locations           write to memory
    R <- #read num, start, size, rp        read bytes from file
    R <- #rmdir DIR                        remove directory
    R <- pattern #rxfind data              returns location of string in text.
    R <- SUBS #rxsubs STRING               Regular expression substitute.
    R <- #screen {size}                    get / set current screen size
    R <- N #sed STR                        edit transcript[N]
    R <- #sleep DELAY                      takes 2-vector: seconds,microseconds
    R <- HOST #socket handle, mode         initialise a socket
    N <- {FS} #split V                     Split text into fields $0 $1 .. $N-1
    R <- #sqrt X                           square root
    N <- {FS} #sstomat V                   Split text into a matrix
    R <- {size} #use name                  hash table as array
    R <- #val string                       numeric conversion
    R <- #varptr idlist                    like varptr (basic)
    R <- #ts                               date and time
    R <- #wget ww                          get characters
    R <- #window vector                    window functions
    R <- data #window vector
    data #wput ww                          write characters to subwindow
    R <- data #write h,start,size,rp       write file data
    R <- #zgets wlist                      form input
    R <- data #zputs wlist                 paste text in windows

APPENDIX B -- EDITOR CONTROL KEYS


SINGLE KEY OPERATIONS

Some of these operations may need to be included in the script file. Those marked with a * must be translated to #sed commands. This list replaces termcap code in the program. When some cursor keys, or function keys don't work, then the Ctrl+ selected key will do the trick.

    F1    ^A       alternate mode - Right to left language
    F2    ^B       help - exit
          ^C  *    copy block
          ^D  *    delete block
          ^E       delete to end of line
    F3    ^F       find functions
          ^G  *    get object
          ^H       delete character
          ^I       Tab forwards
          ^J       Jump (PgDn)
          ^K       Begin file (Home)
          ^L  *    Goto line, or follow link.
    CR    ^M       Carriage Return
          ^N       find next
          ^O  *    object - Cut
    F4    ^P       delete line -- push
    F8    ^Q       quit
    F5    ^R       restore line
    F10   ^S       save exit
    F6    ^T       tag modes
          ^U       page up (PgUp)
          ^V       Insert mode
          ^W  *    width
    F7    ^X  *    move block
    F9    ^Y
          ^Z       End (End key)

    Ctrl + '^'     cursor up.
    Ctrl + '/'     cursor down.
    Ctrl + '\'     cursor forwards
    Ctrl + ']'     cursor backwards.
    Ctrl + '['     Escape key.
    Ctrl + '_'     Leap into the dark.

APPENDIX C. PREDEFINED NAMES

    $0               Name of executable
    $1               Runfile
    $2, $3, ..       other parameters
    $$[J]            get $[i] as an array element
    $ALPHA           Character composition table. <./text.doc>
    $BETA            Character modification table.
    $GZ              Pixel cursor.
    $ENVP            Virtual array environment variables.
    $FSTAT           Virtual array: gives fstat[file_table]
    $KB              Keyboard mapping.
    $KB_SEQ          UNIX systems. Terminal control sequences.
    $LP_DEV          UNIX systems set to /dev/lp0, /dev/lp1 etc.
    $MEM             Virtual memory as an array
    $NL              system symbol table
    $QUOTIENT        quotient evaluated during R<-A #mod B
    $REMAINDER       remainder calculated in R<-A%B
    $SCR[]           Text screen
    $SINK            All output X is treated as $SINK #print X
    $SOCKET[]        Socket
    $STACK_TRACE     Stack trace
    $S[]             Size of $T
    $T[]             Current transcript as an array.
    $TILE            Fill pattern (packed bitmap).
    $V[]             Viewport for current text window.
    _UFILE           Current run file.
    D4LIB_PATH       search path used in #copy"filename"
    EV.KB            Key-board input event handler.
    EV.ALX           Break-in handler.
    var1, var2 ..    var=val pairs on command line
    $stack           Stack
    $nfiles          File descriptors
    $nww             Window descriptors
    $vs              Virtual screen on 32-bit systems.
    $inline          Line input buffer.
    $nsock           Socket table pointer
    WHILE            start iteration
    WEND             end of iteration
    BREAKIF          conditional breakout
    REPEATIF         conditional restart
    SRAND            set random seed

User supplied error handlers may be installed in a script. These deal with break-in, and command input from the keyboard. Two variables, EV.KB and EV.ALX, may be defined in the runfile, or even on the command line. Each of these variables should hold an expression, which is executed either when the system expects a command from the user, or when the Control Break keys are pressed. Normally the system only waits for user input when an error has occurred. It is possible to set these variables to refer to functions, or code sequences which will set the keyboard to English, type an error message, or return to the main menu. An example of the simplest error strategy is:

    EV.KB<-EV.ALX<-"GO"

where GO is some function.
$ENVP[] variables are reset in the environment on startup.

FUTURE DEVELOPMENTS

    Write scripts to use /usr/include symbols for writing device drivers.
    More socket stuff.
    Package manager.
    Info style documentation.
    Autoconf and config style scripts.