Writing My Own Assembler

Just completed a little project. Well, little by the standard of some development projects, but big enough to consume something like six months of my time.

I’ve just finished writing an assembler in JavaScript that runs in the browser.

Backstory as follows: Zilog, way back in the 1970s, developed a CPU chip called the Z80, which became immensely popular for several reasons. These being that:

[1] It was equipped with what was considered at the time to be a beast of an instruction set;

[2] It was relatively affordable compared to some competitors;

[3] It was fully object-code compatible with the earlier Intel 8080, so it could run CP/M, an operating system that had been written for that chip;

[4] It could be integrated into a wide range of systems with relative ease.

The Z80 thus found its way into a number of popular 8-bit home computers in the 1980s - UK computer enthusiasts from the time will remember, for example, the Sinclair Spectrum and the Amstrad CPC series, while a number of Japanese companies brought out the various MSX computers based on the CPU. The chip also became popular in the world of embedded systems, and there are still quite a few legacy embedded systems running Z80 code today.

We now move forward to 2024. Zilog has decided to pension off the Z80 after a long life - the chip, launched in 1976, has been in circulation for no less than 48 years. Tens of millions of lines of code have been written for it over that period.

So, is the Z80 now dead?

Not exactly.

Zilog knew that there was a huge base of legacy code out there that still needed support. So, before retiring the venerable Z80 in its original form, they decided to keep the architecture alive by bringing out a modern version - the eZ80.

Now it so happens that the eZ80 is very definitely aimed more at the embedded systems market than at any home computing market, which in any case is pretty much monopolised by the x86 architecture and its 64-bit extension. Likewise, the ARM series of CPUs has pretty much sewn up the mobile phone sector. As a consequence, there’s a range of eZ80 CPU packages, which include not only a beefed-up version of the original Z80 CPU, but also various add-ons of the sort that the embedded market clamours for, such as dedicated high-speed on-chip I/O devices, timers, and various other useful bits and bobs.

So, I thought to myself, that since I’ve done some Z80 coding in the past, and Zilog have now brought out this new shiny eZ80 version, I’d have fun writing an assembler for it.

“Fun” turned out to be, well, an epithet with a chequered degree of application to this project. One hurdle that took time to overcome was the development of an expression evaluator, a necessary step in making the assembler genuinely useful, so that it can handle operands consisting of mathematical expressions instead of simple numeric values or register names. JavaScript actually has a built-in function for this purpose, namely eval(), but use of this is very strongly discouraged in development circles, because it will execute any JavaScript code contained in the string it is handed. That makes it a massive security hole if used in a project - it can be hijacked by malware with almost embarrassing ease.

So, building an expression evaluator that wasn’t a massive security hole was a priority. That chewed up over a month on its own.

Then came the fun of writing the assembler proper, making it compatible not only with legacy Z80 code, but also with the new, shiny eZ80 and its extensions. For those familiar with the old Z80, these include extending the address space to 16 megabytes, widening the registers to an optional 24 bits, adding new instructions, and allowing the CPU to switch back and forth between legacy Z80 mode and new, shiny eZ80 mode at will.
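To give a feel for what the two modes mean for an assembler, here is a minimal sketch of how one instruction's immediate operand changes width. It is not taken from my assembler: the function names `encodeImmediate` and `assembleLdHl` are made up for illustration, and it ignores the per-instruction mode suffixes the eZ80 also offers. It assumes the standard `LD HL,nn` opcode (0x21) followed by the operand, low byte first, with two operand bytes in Z80 mode and three in the 24-bit mode (which Zilog's manual calls ADL mode).

```javascript
// Hypothetical helper: split an immediate value into little-endian bytes.
// adlMode = true  -> 24-bit operand (eZ80 ADL mode)
// adlMode = false -> 16-bit operand (Z80-compatible mode)
function encodeImmediate(value, adlMode) {
  const width = adlMode ? 3 : 2;
  const limit = 2 ** (8 * width);
  if (!Number.isInteger(value) || value < 0 || value >= limit) {
    throw new RangeError(`Operand ${value} does not fit in ${8 * width} bits`);
  }
  const bytes = [];
  for (let i = 0; i < width; i++) {
    bytes.push((value >>> (8 * i)) & 0xff); // low byte first
  }
  return bytes;
}

// Hypothetical helper: LD HL,nn is opcode 0x21 followed by the immediate.
function assembleLdHl(value, adlMode) {
  return [0x21, ...encodeImmediate(value, adlMode)];
}
```

With this sketch, `assembleLdHl(0x1234, false)` gives the bytes 21 34 12, while `assembleLdHl(0x123456, true)` gives 21 56 34 12. The same mnemonic produces a different instruction length depending on the current mode, which is one reason the assembler has to track the mode as it goes.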

Zilog, bless their little cotton socks, provide a full manual for the instruction set, allowing anyone wading through it to write their own assembler. But, er, the manual is a bit on the large side. It’s also terse and dense, written with seasoned system developers in mind. Not for the faint-hearted.

But, after various struggles, the final project works. Nearly 20,000 lines of JavaScript code, if you include all the custom support libraries I wrote for other projects, which were also useful here, but the BIG file is the actual assembler itself - a whopping 16,248 lines. Debugging this has taken some time, as you can imagine. But now, it’s finished!

Oh, if you want to download the shiny new eZ80 manual from Zilog, you can find it as a downloadable PDF file here. All 411 pages of it. Then you can have fun imagining the hilarity I was involved in, wading through this to build my assembler.

More on this topic to follow after I’ve taken a break!


Recursive descent parsers are a simple way to deal with such expressions, and they should be quite fast to hack together.
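For anyone who hasn't met the technique, here is a minimal sketch of a recursive descent evaluator in JavaScript that avoids eval() entirely. It is only an illustration, not the code from the assembler discussed above: the function names `tokenize` and `evaluate`, the tiny grammar (integers, `$`-prefixed and `0x`-prefixed hex, symbols, `+ - * /`, parentheses) and the `symbols` lookup object are all assumptions made for the example.

```javascript
// Hypothetical sketch: split an expression into number, symbol and operator tokens.
function tokenize(source) {
  const pattern = /\s*(\$[0-9a-f]+|0x[0-9a-f]+|\d+|[a-z_]\w*|[-+*\/()])/iy;
  const tokens = [];
  const end = source.trimEnd().length;
  let pos = 0;
  while (pos < end) {
    pattern.lastIndex = pos;
    const match = pattern.exec(source);
    if (!match) throw new SyntaxError(`Unexpected character near position ${pos}`);
    tokens.push(match[1]);
    pos = pattern.lastIndex;
  }
  return tokens;
}

// Grammar, lowest to highest precedence:
//   expr  := term (('+' | '-') term)*
//   term  := unary (('*' | '/') unary)*
//   unary := ('+' | '-') unary | atom
//   atom  := number | symbol | '(' expr ')'
function evaluate(source, symbols = {}) {
  const tokens = tokenize(source);
  let index = 0;
  const peek = () => tokens[index];
  const next = () => tokens[index++];

  function parseExpr() {
    let value = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      const rhs = parseTerm();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  function parseTerm() {
    let value = parseUnary();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      const rhs = parseUnary();
      if (op === "/" && rhs === 0) throw new RangeError("Division by zero");
      // Integer division, truncating toward zero
      value = op === "*" ? value * rhs : Math.trunc(value / rhs);
    }
    return value;
  }

  function parseUnary() {
    if (peek() === "-") { next(); return -parseUnary(); }
    if (peek() === "+") { next(); return parseUnary(); }
    return parseAtom();
  }

  function parseAtom() {
    const token = next();
    if (token === undefined) throw new SyntaxError("Unexpected end of expression");
    if (token === "(") {
      const value = parseExpr();
      if (next() !== ")") throw new SyntaxError("Expected ')'");
      return value;
    }
    if (/^\d/.test(token)) return Number(token); // decimal or 0x-prefixed hex
    if (token[0] === "$") return parseInt(token.slice(1), 16);
    if (/^[a-z_]/i.test(token)) {
      if (Object.prototype.hasOwnProperty.call(symbols, token)) return symbols[token];
      throw new ReferenceError(`Undefined symbol: ${token}`);
    }
    throw new SyntaxError(`Unexpected token: ${token}`);
  }

  const result = parseExpr();
  if (index < tokens.length) throw new SyntaxError(`Unexpected token: ${peek()}`);
  return result;
}
```

For example, `evaluate("$4000 + 2*(3+1)")` returns 16392, and `evaluate("start + 0x10", { start: 0x8000 })` returns 32784. Each grammar rule becomes one small function, and because the input is only ever tokenised and interpreted, never executed, a hostile string can at worst raise an error.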

Actually, parsers of that kind are quite fun. As an undergrad, I took on the self-imposed challenge to program a compiler over summer vacation. It compiled a subset of Basic into assembly code for the Motorola 68000 processor, and I let the system assembler take it from there, to make a working executable binary. Due to time constraints, I only implemented integer arithmetic, leaving out floating point. I also left out an optimiser stage, as that would have made stuff much more complicated. I based the design on a basic skeleton of a recursive descent parser that I stitched together from selected chapters in the Dragon book. The resulting compiler worked, but the generated code was not very efficient and the supported language was quite primitive. But it worked, it was mine, and I wrote it. Yay.

Circa 1980 I wrote an assembler for the Z80. In FORTRAN.


GADS! I’m impressed. I did a lot of Z80 assembly programming and wrote a BASIC interpreter for it back in the late 70s. The Z80 was SO much better than the 8080 series, but it largely fell by the wayside when Gates wrote DOS for the 8086 series. CP/M was also way better than DOS, but when IBM promoted the 8086 and DOS, it was pretty much all over for CP/M and the Z80.

Bill Gates did not write MS-DOS. It was written by Tim Paterson at Seattle Computer Products and bought by Microsoft (for $25,000).

In what ways was CP/M better than MS-DOS? MS-DOS was a close clone of CP/M so I fail to see how CP/M was “way better” than MS-DOS.

Of course you are right. Gates was smart enough to sell IBM on it.

I have forgotten WHY I am left with the impression that CP/M was better than DOS. Perhaps I just had a preference for it. It may just have been an impression left by the CP/M/Z80 combination as opposed to DOS/8086. I haven’t done any of that for many decades and I’m old and my memory is not what it used to be**. That’s my story and I’m sticking with it.

**To be fair, my memory never was what it used to be.

And wasn’t it originally called Dr. DOS?

No, it wasn’t. DR DOS was Digital Research’s (hence the “DR” in its name) version of DOS that they released in the late-80s. It was a rename of CP/M-86.

When the IBM PC first came out in 1981, you could get PC-DOS or CP/M-86 for it, but PC-DOS (IBM’s branded version of MS-DOS) cost only about 1/4 what Digital Research charged for CP/M-86. As you can guess, customers opted for the cheaper PC-DOS in overwhelming numbers, and the rest is history.
