Hoc - An Interactive Language For Floating Point Arithmetic

Brian Kernighan
Rob Pike

ABSTRACT

Hoc is a simple programmable interpreter for floating point
expressions. It has C-style control flow, function definition and
the usual numerical built-in functions such as cosine and logarithm.

1. Expressions

Hoc is an expression language, much like C: although there are
several control-flow statements, most statements such as assignments
are expressions whose value is disregarded. For example, the
assignment operator = assigns the value of its right operand to its
left operand, and yields the value, so multiple assignments work.
The expression grammar is:

    expr:   number
          | variable
          | ( expr )
          | expr binop expr
          | unop expr
          | function ( arguments )

Numbers are floating point. The input format is that recognized by
scanf(3): digits, decimal point, digits, e or E, signed exponent.
At least one digit or a decimal point must be present; the other
components are optional.

Variable names are formed from a letter followed by a string of
letters and digits. binop refers to binary operators such as
addition or logical comparison; unop refers to the two negation
operators, `!' (logical negation, `not') and `-' (arithmetic
negation, sign change). Table 1 lists the operators.

+-----------------------------------------------------------+
| Table 1: Operators, in decreasing order of precedence     |
+-----------------------------------------------------------+
| ^        exponentiation (FORTRAN **), right associative   |
| ! -      (unary) logical and arithmetic negation          |
| * /      multiplication, division                         |
| + -      addition, subtraction                            |
| > >=     relational operators: greater, greater or equal, |
| < <=     less, less or equal,                             |
| == !=    equal, not equal (all same precedence)           |
| &&       logical AND (both operands always evaluated)     |
| ||       logical OR (both operands always evaluated)      |
| =        assignment, right associative                    |
+-----------------------------------------------------------+

Functions, as described later, may be defined by the user. Function
arguments are expressions separated by commas. There are also a
number of built-in functions, all of which take a single argument,
described in Table 2.

+-----------------------------------------------------+
| Table 2: Built-in Functions                         |
+-----------------------------------------------------+
| abs(x)    |x|, absolute value of x                  |
| atan(x)   arc tangent of x                          |
| cos(x)    cosine of x                               |
| exp(x)    e^x, exponential of x                     |
| int(x)    integer part of x, truncated towards zero |
| log(x)    logarithm base e of x                     |
| log10(x)  logarithm base 10 of x                    |
| sin(x)    sine of x                                 |
| sqrt(x)   square root of x, x^(1/2)                 |
+-----------------------------------------------------+

Logical expressions have value 1.0 (true) and 0.0 (false). As in C,
any non-zero value is taken to be true. As is always the case with
floating point numbers, equality comparisons are inherently suspect.

Hoc also has a few built-in constants, shown in Table 3.

+--------------------------------------------------------------------+
| Table 3: Built-in Constants                                        |
+--------------------------------------------------------------------+
| DEG    57.29577951308232087680  180/pi, degrees per radian         |
| E      2.71828182845904523536   e, base of natural logarithms      |
| GAMMA  0.57721566490153286060   gamma, Euler-Mascheroni constant   |
| PHI    1.61803398874989484820   (sqrt(5)+1)/2, the golden ratio    |
| PI     3.14159265358979323846   pi, circular transcendental number |
+--------------------------------------------------------------------+

2. Statements and Control Flow

Hoc statements have the following grammar:

    stmt:     expr
            | variable = expr
            | procedure ( arglist )
            | while ( expr ) stmt
            | if ( expr ) stmt
            | if ( expr ) stmt else stmt
            | { stmtlist }
            | print expr-list
            | return optional-expr

    stmtlist: (nothing)
            | stmtlist stmt

An assignment is parsed by default as a statement rather than an
expression, so assignments typed interactively do not print their
value.

Note that semicolons are not special to hoc: statements are
terminated by newlines. This causes some peculiar behavior. The
following are legal if statements:

    if (x < 0) print(y) else print(z)

    if (x < 0) {
        print(y)
    } else {
        print(z)
    }

In the second example, the braces are mandatory: the newline after
the if would terminate the statement and produce a syntax error were
the brace omitted.

The syntax and semantics of hoc control flow facilities are
basically the same as in C. The while and if statements are just as
in C, except there are no break or continue statements.

3. Input and Output: read and print

The input function read, like the other built-ins, takes a single
argument. Unlike the other built-ins, though, the argument is not an
expression: it is the name of a variable. The next number (as
defined above) is read from the standard input and assigned to the
named variable. The return value of read is 1 (true) if a value was
read, and 0 (false) if read encountered end of file or an error.

Output is generated with the print statement. The arguments to print
are a comma-separated list of expressions and strings in double
quotes, as in C. Newlines must be supplied; they are never provided
automatically by print.
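The return-value convention of read and the explicit-newline rule of print can be modelled outside hoc. The following Python sketch is not hoc's implementation; it only mimics the behavior described above. The names NUMBER, read_number and echo_values are hypothetical, input is assumed to be split on whitespace, a leading sign is accepted as scanf does, and the %.8g format is an assumption chosen to match the eight significant digits seen in this document's sample output.

```python
import io
import re

# Assumed pattern for one number: optional sign, digits, decimal point,
# digits, optional exponent. This sketch requires at least one digit.
NUMBER = re.compile(r"[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][+-]?[0-9]+)?")


def read_number(tokens, variables, name):
    # Model of read(name): fetch the next token and store it under `name`.
    # Returns 1.0 if a value was read, 0.0 at end of input or on an error.
    token = next(tokens, None)
    if token is None or NUMBER.fullmatch(token) is None:
        return 0.0
    variables[name] = float(token)
    return 1.0


def echo_values(text):
    # Model of a loop that reads numbers into x and prints each one.
    tokens = iter(text.split())
    variables = {}
    out = io.StringIO()
    while read_number(tokens, variables, "x"):
        # Like print, write() adds no newline; it is supplied explicitly.
        out.write("value is %.8g\n" % variables["x"])
    return out.getvalue()
```

In this model the loop stops at the first token that is not a number, just as it stops at end of input, because both cases make read_number return 0.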
Note that read is a special built-in function, and therefore takes a
single parenthesized argument, while print is a statement that takes
a comma-separated, unparenthesized list:

    while (read(x)) {
        print "value is ", x, "\n"
    }

4. Functions and Procedures

Functions and procedures are distinct in hoc, although they are
defined by the same mechanism. This distinction is simply for
run-time error checking: it is an error for a procedure to return a
value, and for a function not to return one.

The definition syntax is:

    function:   func name() stmt
    procedure:  proc name() stmt

name may be the name of any variable; built-in functions are
excluded. The definition, up to the opening brace or statement, must
be on one line, as with the if statements above.

Unlike C, the body of a function or procedure may be any statement,
not necessarily a compound (brace-enclosed) statement. Since
semicolons have no meaning in hoc, a null procedure body is formed
by an empty pair of braces.

Functions and procedures may take arguments, separated by commas,
when invoked. Arguments are referred to as in the shell: $3 refers
to the third (1-indexed) argument. They are passed by value and
within functions are semantically equivalent to variables. It is an
error to refer to an argument numbered greater than the number of
arguments passed to the routine. The error checking is done
dynamically, however, so a routine may have variable numbers of
arguments if initial arguments affect the number of arguments to be
referenced (as in C's printf).

Functions and procedures may recurse, but the stack has limited
depth (about a hundred calls). The following shows a hoc definition
of Ackermann's function:

    $ hoc
    func ack() {
        if ($1 == 0) return $2+1
        if ($2 == 0) return ack($1-1, 1)
        return ack($1-1, ack($1, $2-1))
    }
    ack(3, 2)
        29
    ack(3, 3)
        61
    ack(3, 4)
    hoc: stack too deep near line 8
    ...

5. Examples

Stirling's formula:

    n! ~ sqrt(2*n*pi) * (n/e)^n * (1 + 1/(12*n))

    $ hoc
    func stirl() {
        return sqrt(2*$1*PI) * ($1/E)^$1*(1 + 1/(12*$1))
    }
    stirl(10)
        3628684.7
    stirl(20)
        2.4328818e+18

Factorial function, n!:

    func fac() if ($1 <= 0) return 1 else return $1 * fac($1-1)

Ratio of factorial to Stirling approximation:

    i = 9
    while ((i = i+1) <= 20) {
        print i, " ", fac(i)/stirl(i), "\n"
    }
    10 1.0000318
    11 1.0000265
    12 1.0000224
    13 1.0000192
    14 1.0000166
    15 1.0000146
    16 1.0000128
    17 1.0000114
    18 1.0000102
    19 1.0000092
    20 1.0000083
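The ratio table can be cross-checked outside hoc. The Python sketch below recomputes fac(i)/stirl(i) in double precision. It is an independent check, not a translation of the interpreter: fac uses the library factorial instead of recursion, ratio is a hypothetical helper name, and the %.8g format is an assumption chosen to match the eight significant digits shown above.

```python
import math


def fac(n):
    # Exact factorial; stands in for the recursive hoc definition.
    return math.factorial(n)


def stirl(n):
    # Stirling's formula with the 1/(12n) correction term.
    return math.sqrt(2 * n * math.pi) * (n / math.e) ** n * (1 + 1 / (12 * n))


def ratio(n):
    # Ratio of the exact factorial to the Stirling approximation.
    return fac(n) / stirl(n)


if __name__ == "__main__":
    for i in range(10, 21):
        print("%d %.8g" % (i, ratio(i)))
```

The ratio stays slightly above 1 because the next term of the Stirling series, 1/(288*n^2), is positive and is omitted from stirl; the excess therefore shrinks roughly as 1/n^2.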