Differences between Jal & PICJal
================================

Running the Compiler
====================

Arguments to the compiler:

  -hex arg : set the name of the hex file. The default is source.hex
  -asm arg : set the name of the asm file. The default is source.asm
  -rickpic : I work with the MarkIII, a 16f877 based board. The PIC
             comes with a built-in loader which requires that locations
             0, 1, 2 are not touched
             3 holds a GOTO to the first instruction
  -debug   : show debugging output
  -s arg   : set the search path. multiple paths are seperated by semi-colons
  -nofuse  : don't include the config word in the assembly or hex files.

Parser
======

No extra spaces are required.

The *first* line should include one of the files in the chipdef directory!
Amoung other things, these files define things required by the code
generator (code size, data bank locations and sizes, eeprom location
and size, configuration word location and size, stack depth) and create 
the various ``pragma target'' definitions!


Variables
=========

The following variable types have been introduced:

SBYTE  : signed, 8-bit value
WORD   : unsigned, 16-bit value
SWORD  : signed, 16-bit value
DWORD  : unsigned, 32-bit value
SDWORD : signed, 32-bit value

In addition, other byte sizes are defined as follows:

SBYTE*n : create a signed, n * 8 bit value
BYTE*n  : create an unsigned, n * 8 bit value

True bit fields are also introduced:

BIT     : as with JAL, this creates a *boolean* value.
          Any non-zero assignment sets the value to TRUE,
          and zero assignment sets the value to FALSE

BIT*n   : create an n-bit value. Assignment to this type
          follows bitfield assignment rules. Aka
          BIT*5 n

          n = y

          translates to:

          n = (y & 31)

The byte order when storing variables is LSB...MSB.

Variables are allocated out of all existing banks, and bank setup for
variable access is automatic.

The full format for variable definition is rather complex, but here's goes.
Anything in braces '['..']' is optional; UPPER CASE denotes a keyword;
single quotes denote character constants; cexpr is a constant expression

VAR [[VOLATILE] type] identifier [ '(' cexpr ')' ]
  [ AT cexpr [ : bit ] | variable [ : bit ] | { cexpr1[, cexpr...] } ] 
  [ IS variable ]
  [ '=' cexpr | '{' cexpr1, ... '}' ]

type        one of the predefined types (BYTE, SBYTE, ...)
VOLATILE    This value can change without notice and during assignment it
            cannot be used to hold intermediate results
'('..')'    sets up an array with cexpr entries. The array is accessed
            using identifier(n) where 0 <= n < cexpr
AT cexpr[:bit]    This variable is placed at location cexpr and that location
                  will not be used during variable allocation. If type is BIT,
                  the variable will exist at bit 0 at cexpr unless the [:bit]
                  is used.
AT variable[:bit] Like above, but use the same base as variable
AT { cexpr1, ...} Like above, but this variable is mirrored at multiple
                  locations. Currently this is limited to 4. Any read or
                  write at any of the locations is assumed to be equivalent.
                  (This is used by the databit reduction code).
IS variable       This simply creates an alias. This type must match the
                  original variable type
= cexpr           Assign cexpr to this variable
= {cexpr1,...}    If this is an array, assign the following cexprs to the
                  array.

So, for a 10 element BYTE array x:

VAR BYTE x(10) = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }

arrays can also be constant, in which case they reduce to a lookup table
as follows:

CONST BYTE x(10) = { 'a', 'b', 'c', 'd', 'e', 'f' }

This will take no data or code space if it is only accessed via a constant
subscript. It becomes a lookup table (requires codespace) if accessed via
a variable subscript.

A lookup table is limited to 255 entries regardless of type (for example,
you can have a 255 entry DWORD lookup table).

A variable must entirely fit within a bank.

Expressions
===========

Expressions follow the C promotion value promotion rules.

Procedured & Functions
======================

  Interrupts
  ----------

  Any number of procedures can contain ``#pragma interrupt''. On an
  interrupt each procedure with ``#pragma interrupt'' is called, though
  there is no guarenteed order.

  A procedure containing the ``#pragma interrupt'' must take no parameters.

  Re-entrancy
  -----------
  All procedures and functions are fully re-entrant so may be called from
  both interrupt and user routines.

  Forward Declaration
  -------------------

  A function or procedure may be forward declared (declared without
  definition) by using the word ``forward'' in place of ``is''.

  Function/Procedure Declaration
  ------------------------------

  No default parameters are allowed.

  No get/set procedures are automatically created when passing variables
  as volatile parameters

Language
========

The OPERATOR keyword is not supported.

The FOR command has changed as follows:

  FOR expr [USING var] LOOP [block] END LOOP

``USING var'' gives access to the looping variable which holds values from
zero through (expr - 1).

Also, expr is evaluated once before entering the block.

Multiplication, division, and shifting are part of the language. Shifting is
always done inline, whereas multiplication and division are implemented as
functions. Only one version of each is created -- the one that can compute
the maximum size required. For example:

BYTE a, b, c

c = a * b

will create an 8-bit multiply function.

BYTE a, b
WORD c, d

a = a * b
c = c * d

this creates a 16-bit multiply function that is called in both cases

Jump Tables
===========

Jump tables are *not* supported. There simply is no obvious way to implement
them. My understanding is jump tables are guarenteed to be withing 256
bytes so an ``addwf pcl'' will always work. This simply isn't feasible
(see the next topic).

It appears most jump tables are really due to a lack of lookup tables.
This is solved above (use constant arrays).

Phases of Compilation
=====================

The compiler consists of three parts

  1. the parser converts text to p-code
  2. the p-code is optimizied
  3. the p-code runs through the back end to finally create the HEX file

the interesting bit is the back end. Initially, *every* variable access
and *every* branch contains the setup bits. During data bit optimization,
all unnecessary data bits are removed. Finally, every unnecessary branch
bit is removed. This is done in a loop:

  do
    remove all unnecessary branch bits
  while at least one bit was removed.

Now, let's say we're trying to guarentee that a bit of code doesn't span
a 256 byte boundary. We could add another phase, align jump tables, but
by doing so we might end up pushing a routine far enough that we need
some branch bits. This quickly becomes a mess.

