Thursday, 10 October 2019

Magic Numbers

Generally they're a bad thing. Magic Numbers in code are when someone has put + 3187 in an expression or something, and you don't know why. There seems to be a modern trend of "not commenting anything", on the apparent grounds that the code is self evident. Invariably it isn't.

Anyways, I wrote a script in Python which tries various mappings, to find a simple way of mapping the BASIC tokens onto P-Codes.

After some tinkering, I found the following :-

  • Firstly reduce any tokens with values of 187 or more by 39.
  • Secondly deduct 33 from any character tokens (i.e. 0-127)
  • Thirdly deduct 123 from any other tokens (i.e. 128+)

This rather neatly maps all the tokens and characters onto 0-59 (room for expansion, of course) which leaves 60-255 free.
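As a rough sketch, the three steps above come out something like this in Python. The function name is mine, not from the actual script, and the post doesn't list which values are legal, so this only does the arithmetic; it doesn't check that the result lands in 0-59.

```python
def token_to_pcode(value: int) -> int:
    """Apply the three mapping steps to a character or token value.

    Hypothetical helper: only meaningful for values that are actually
    legal in the language (the legal set is not listed in the post).
    """
    # Step 1: tokens of 187 or more are pulled down by 39.
    if value >= 187:
        value -= 39
    # Step 2: character tokens (0-127) have 33 deducted.
    if value < 128:
        return value - 33
    # Step 3: everything else (128+) has 123 deducted.
    return value - 123


if __name__ == "__main__":
    for v in (ord("!"), ord("+"), 128, 187, 204):
        print(v, "->", token_to_pcode(v))
```

So '!' (33) becomes 0, and the last token, 204, becomes 204 - 39 - 123 = 42.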

The reason I want this is that there are 4 other P-Codes which are internal - these are branch, branch if tos = 0, 16 bit push, 6 bit push (-1 to 62, 1 byte) and 15 bit push (0-32767, 2 bytes). The last pair (6 and 15 bit pushes) take up 192 of the available opcodes. However, of the other 60, about 1/3 are not used.
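The 192 figure works out if the 6 bit push gets one opcode per value (-1 to 62 is 64 values) and the 15 bit push keeps its top 7 bits in the opcode and the low 8 bits in the second byte (128 opcodes). That leaves 4 of the 196 opcodes in 60-255. Here is a minimal sketch of an encoder on that basis. The base values 64 and 128 are purely my assumption for illustration; the post gives the sizes of the ranges, not where they sit.

```python
# Hypothetical opcode layout: the sizes come from the post, the positions do not.
SHORT_PUSH_BASE = 64    # assumed start of the 64 opcodes for constants -1..62
LONG_PUSH_BASE = 128    # assumed start of the 128 opcodes for 15 bit constants


def encode_push(n: int) -> bytes:
    """Encode a constant push using the shortest form that fits."""
    if -1 <= n <= 62:
        # One byte: the constant is folded into the opcode itself.
        return bytes([SHORT_PUSH_BASE + n + 1])
    if 0 <= n <= 32767:
        # Two bytes: high 7 bits in the opcode, low 8 bits follow.
        return bytes([LONG_PUSH_BASE + (n >> 8), n & 0xFF])
    raise ValueError("out of range, would need the 16 bit push")


if __name__ == "__main__":
    for n in (-1, 62, 63, 32767):
        print(n, encode_push(n).hex())
```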

To "syntax check" I have also generated a bitmap for values from 32 (first character) to 204 (last token). This is a small array with 1 bit per value, set if this is syntactically legal, so if LEFT$ say is used, then this bit will be clear, so an error can be signalled.
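A bitmap like that can be sketched in a few lines. This is an illustration rather than the real generator: the bit order within each byte is my assumption, and the set of legal values in the usage below is a made-up placeholder. Values 32 to 204 are 173 entries, so the array is 22 bytes.

```python
FIRST, LAST = 32, 204   # first character and last token, as in the post


def build_bitmap(legal_values) -> bytearray:
    """Build a 1-bit-per-value table, bit set means syntactically legal."""
    bitmap = bytearray((LAST - FIRST) // 8 + 1)
    for value in legal_values:
        index = value - FIRST
        # Assumed bit order: lowest bit of each byte is the lowest value.
        bitmap[index >> 3] |= 1 << (index & 7)
    return bitmap


def is_legal(bitmap: bytearray, value: int) -> bool:
    """Return True if the bit for this value is set."""
    if not FIRST <= value <= LAST:
        return False
    index = value - FIRST
    return bool(bitmap[index >> 3] & (1 << (index & 7)))


if __name__ == "__main__":
    # Placeholder set of legal values, not the real one.
    table = build_bitmap({ord("+"), ord("-"), 128})
    print(len(table), is_legal(table, ord("+")), is_legal(table, 200))
```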

So compilation is fairly simple really. Just those three steps above. Additional code is required for data, back-origin, identifiers, constants, but these aren't really complicated.

So plan the first is to write a compiler in Python. Then a 6502 run time, with a skeleton to run the compiled code. Then see if the compiler will compile itself - if it starts at the same memory location, it should produce exactly the same binary code.

Tuesday, 8 October 2019

The compiler

The compiler is an odd thing.

Firstly, and I don't recall this anywhere else, it compiles backwards. Literally. When you figure out how it works, it actually makes a lot of sense though.

Secondly it's not really a compiler. It produces P-Code - a simple bytecode - from tokens. The sensible thing to do is to work out some sort of correlation between the tokens and the P-Code. So with the odd exception - the data operations, constants, identifiers, if/then/else - all the compiler does is copy the input.

Backwards ;-)

It is also bootstrapping. So the compiler is written in the language. I don't know how the first compiler was created (I'd guess it was BASIC) but I'll probably use Python.

Friday, 4 October 2019

About the language

It starts with BURP.

Wireless World published an article on a Z80 computer sometime in the late 1970s or early 1980s. What was odd about this is that it had a maths coprocessor - the MM57109.

Whoever wrote the BASIC for this had decided to offload the arithmetic onto the co-processor. But nothing else.

So it looked a bit odd. Sort of half and half. LET A = B C + 2 * sort of thing. But there wasn't much consistency. Comparisons were still done by IF A=4 THEN for example, when perhaps it should have been IF A 4 = THEN.
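For anyone not used to the notation, B C + 2 * means "push B, push C, add them, push 2, multiply", i.e. (B + C) * 2. A tiny stack evaluator shows the idea. This is just a sketch of how reverse Polish works in general, with integers only; it says nothing about how that BASIC or the MM57109 actually did it, and the function name is mine.

```python
def eval_rpn(expression: str, variables: dict) -> int:
    """Evaluate a space-separated reverse Polish expression."""
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for word in expression.split():
        if word in operators:
            # Operators take the top two stack entries and push the result.
            right = stack.pop()
            left = stack.pop()
            stack.append(operators[word](left, right))
        elif word in variables:
            stack.append(variables[word])
        else:
            stack.append(int(word))
    return stack.pop()


if __name__ == "__main__":
    print(eval_rpn("B C + 2 *", {"B": 3, "C": 4}))
```

With B = 3 and C = 4 that gives (3 + 4) * 2 = 14.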

Interesting though. While tinkering around I came across something not dissimilar but slightly more consistent. A language called "RPL" written by a chap called Tim Stryker in the early 1980s, which ran on the Commodore PET.

This is a bit different. It's the same basic sort of idea, but it's more like FORTH.

Interestingly, it is integrated as a compiler into BASIC. So you edited it using BASIC and ran a (small) compiler. It was possible to use routines in BASIC.

It looks like FORTH in many ways, and not others. It has a slightly mangled syntax ; because you type it in in BASIC it has to live with the Microsoft BASIC Parser. It has "Goto" for loops. It also doesn't have any of the scaffolding stuff that you can do in FORTH - so you can't create words that compile - you'd have to modify the compiler to do that.

RPL/PET Manual

http://www.portcommodore.com/dokuwiki/lib/exe/fetch.php?media=larry:comp:flash_attack:fa-rplmaual.pdf

RPL/PET Discussion about it

https://portcommodore.com/dokuwiki/doku.php?id=larry:comp:flash_attack:flash_attack_notes

Tuesday, 1 October 2019

Introduction

Okay. Where do I start ?

This time round I decided to write a new language of some sort for the Commander X16 (CX16). This, for those of you who don't know, is a Retrocomputer designed by (among others) David "8 Bit Guy" Murray, and it's basically a souped up modernised C64. So it sort of counts as Retro (it's still got a 6502 in it) even though it's only a few months old.

I didn't want to duplicate anyone else's work or ideas. I wanted something that would be self hosting (if a compiler) and the option of writing the Compiler in 6502 assembler or bootstrapping it.

So I looked at various options for ideas, or even something worth copying:

Pascal - there's a Tiny Pascal in Byte which I thought I could adapt and extend

Quick - Quick is an Atari High Level Assembler - part way between a HLL and Assembler.

65CM, C65CM, Pas64 - more High Level Assemblers, variations on the same theme.

Action, PL/65, Promal - various specific 6502 languages that resemble C type languages

Color FORTH - there are already several people wanting to do this, this is a bit different.

Stage 2 - some work by Professor Waite on Macro generation languages.

The 6502 has a lot of faults. Sometimes I wonder what's good about it other than it (was) cheap. The thing is, it really doesn't do these things that well. Especially when it's running the BASIC ROM from the C64 which grabs half of zero page. Some of my experiments ended up with a lot of lda xxx adc xxx sta xxxx all over the place, often with word addresses. I looked at some quite serious compiler projects and they tended to have the same problems.

I did look at byte compilers; there's quite a nifty one for the VIC-20, but it's just too limited. That was where Color FORTH came from: one way to avoid the load and save locals problem is not to have any, or not many. I looked at porting the Flat-Forth I created for the Z80 and never quite finished, but that didn't really work either.

Some work round it with Runtime systems, but then you lose quite a lot of the clout. Some have serious optimisers, which are never going to fit on a 6502 machine really.

So I ended up with two options. One was Color FORTH. The other was a language as yet unnamed, called CX16-HLA for now. This is quite advanced (in the sense that the bootstrap generates code and it runs and I wrote some functions), and it sort of works round the load add save problem by having a 16 bit or 8 bit accumulator modelled around YA, and doing operations in 8 bit or 16 bit mode.

So it looked a bit like count + 1 - 4 ^ $4C => result type code with loops, ifs and procedures. Definitely better, but still not happy with it.

I very nearly did that - there's a working compiler, and there's a semi working Color FORTH as well. I might finish them later.

So, basically, I nearly abandoned it. Then in an odd musing about another system pretty much nobody remembers from the 80s and browsing about it, I came across something else, that gave me an idea.