Emulating a CPU in C++ (6502)

  • 🎬 Video
  • ℹ️ Published 1 year ago

This isn't a full implementation of the 6502. It's more of a from-scratch intro to learning how a CPU works by writing an emulator for one (in this case the 8-bit 6502).


Timestamps:
0:00 - Intro
0:29 - The 6502
4:24 - Creating CPU Internals
9:23 - Resetting the CPU
12:48 - Creating the Memory
15:10 - Creating the Execute function
23:32 - Emulating "LDA Immediate" instruction
28:00 - Hardcoding a test program
31:50 - Emulating "LDA Zero Page" instruction
37:20 - Emulating "LDA Zero Page,X" instruction
38:42 - Emulating "JSR" instruction
48:30 - Closing comments

💬 Comments
Author

BTW, the SP (stack pointer) should only be a Byte (8bits) not a Word (16bits)
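
A minimal sketch of that point, assuming a hypothetical `Stack6502` struct (the names are illustrative and not taken from the video). On the 6502 the stack always lives in page one ($0100-$01FF), so an 8-bit SP is combined with a fixed high byte and wraps on its own:

```cpp
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;

// Hypothetical minimal state: just the stack pointer and 64 KiB of memory.
struct Stack6502 {
    Byte SP = 0xFF;          // 8-bit stack pointer, wraps between 0x00 and 0xFF
    Byte mem[0x10000] = {};  // full 16-bit address space

    // The effective stack address is always inside page one ($0100-$01FF).
    Word Address() const { return static_cast<Word>(0x0100 | SP); }

    // Push stores at the current slot, then decrements SP.
    void Push(Byte value) { mem[Address()] = value; --SP; }

    // Pull increments SP first, then reads the slot.
    Byte Pull() { ++SP; return mem[Address()]; }
};
```

With a 16-bit SP the wrap-around would have to be masked by hand on every push and pull.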

Author — Aaronjamt

Author

How astonishing to find this YT suggestion! I wrote a 6502/6503 emulator in 1987 in C on a PC-XT (8086). Both clocks of the 6502 and the 8086 were at 4 MHz. The emulation was 400 times slower than the real processor, but it was embedded in a debugger (MS C4-like) and it was possible to set breakpoints, survey memory values, execute step by step, and so on... Ahh! Nostalgia...

Author — Philippe LePilote

Author

If you use uint8_t and uint16_t (from <cstdint>) for your CPU types, you make your code basically platform agnostic. Right now you depend on shorts being 16 bits wide.
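
A small sketch of what that could look like, assuming the aliases are called `Byte` and `Word`. `MakeWord` is a hypothetical helper added only for illustration:

```cpp
#include <climits>
#include <cstdint>

// Fixed-width aliases: the sizes no longer depend on the platform's short/char.
using Byte = std::uint8_t;
using Word = std::uint16_t;

static_assert(sizeof(Byte) * CHAR_BIT == 8, "Byte must be exactly 8 bits");
static_assert(sizeof(Word) * CHAR_BIT == 16, "Word must be exactly 16 bits");

// Hypothetical helper: combine a low and a high byte into a 16-bit word.
constexpr Word MakeWord(Byte lo, Byte hi) {
    return static_cast<Word>(lo | (hi << 8));
}
```

Because `Word` is exactly 16 bits, incrementing 0xFFFF wraps to 0x0000 the same way on every platform.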

Author — ernestuz

Author

Did a FULL implementation back in 1986 in C to simulate machine tools controlled by a 6502. It was cheaper to test code on a PC before putting it into a tool than it was to put code on the tools and have it break something.

Probably would have been easier in C++ as you could model the various components of the CPU as objects.

Author — Tony Bright

Author

On a similar theme, back in the 1980s I wrote an assembler/disassembler pair for the Z80 microprocessor that ran on a Pyramid minicomputer. I used it to work out the full functionality of a UHF radio scanner that had a Z80 and associated IO chips as its central control. I dumped the radio's 16k byte EPROM into a file containing a long string of HEX pairs, disassembled it and printed the result. Then I spent a few days looking at the printout and filling it with comments. I made my modifications, added all the comments to the disassembled program and used my assembler to create a new HEX file ready for EPROM programming. Started up the radio and all my mods were working as planned. They were fun days. I doubt I could do what I did back then with today's systems.

Author — Dexxter

Author

You use the term "clock cycle" for what is actually a machine cycle. Early processors such as the 6502 required multiple clock cycles to execute one machine cycle. The 68HC11 for example needed 4 clock cycles for each machine cycle.

Author — etmax1

Author

08:30

You are missing the unused (expansion) flag of the 6502 in bit 5 between B and V (I'm assuming your compiler assigns the bits from the least significant bit 0 upwards).

Without this bit some processor status manipulations (such as PHP, PLA, play with bits, PHA, PLP) could fail as V and N would be stored in the wrong bits.
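
A sketch of one way to avoid that problem, assuming explicit bit masks instead of compiler-assigned bitfields (whose layout is implementation-defined). The `Status` struct and its member names are illustrative only:

```cpp
#include <cstdint>

using Byte = std::uint8_t;

// Bit masks of the 6502 status register, from bit 0 upwards.
enum StatusFlag : Byte {
    FlagC      = 1 << 0,  // carry
    FlagZ      = 1 << 1,  // zero
    FlagI      = 1 << 2,  // interrupt disable
    FlagD      = 1 << 3,  // decimal mode
    FlagB      = 1 << 4,  // break
    FlagUnused = 1 << 5,  // unused (expansion) bit
    FlagV      = 1 << 6,  // overflow
    FlagN      = 1 << 7   // negative
};

// Hypothetical flag container with explicit packing, as PHP/PLP would need.
struct Status {
    bool C = false, Z = false, I = false, D = false, B = false, V = false, N = false;

    // Pack into the byte that gets pushed; the unused bit is set to 1 here.
    Byte ToByte() const {
        Byte p = FlagUnused;
        if (C) p |= FlagC;
        if (Z) p |= FlagZ;
        if (I) p |= FlagI;
        if (D) p |= FlagD;
        if (B) p |= FlagB;
        if (V) p |= FlagV;
        if (N) p |= FlagN;
        return p;
    }

    // Unpack a byte pulled from the stack; bit 5 is ignored.
    static Status FromByte(Byte p) {
        Status s;
        s.C = (p & FlagC) != 0;
        s.Z = (p & FlagZ) != 0;
        s.I = (p & FlagI) != 0;
        s.D = (p & FlagD) != 0;
        s.B = (p & FlagB) != 0;
        s.V = (p & FlagV) != 0;
        s.N = (p & FlagN) != 0;
        return s;
    }
};
```

With the masks spelled out, V and N land in bits 6 and 7 regardless of how the compiler would have ordered a bitfield.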

Author — Cigmorfil

Author

Memory-mapped I/O is still very much in use. A large amount of memory address space on a modern PC is used, e.g., by your video card, which is why 32-bit Windows would only have ~3 GB available to applications on a system with 4 GB of RAM installed.

Author — cogwheel42

Author

To anyone thinking about coding their own...

Most processors, internally, use predictable bits of the instruction opcode to identify the addressing modes - because, the processor really needs to be able to decode opcodes fast, without having to 'think' about it! Understanding this strategic bit pattern can make writing a CPU emulator SO much easier!

It's been a long time since I coded in 6502 ASM ... but, if you were to plot each instruction in a table, you'd likely notice that the addressing modes fall into very neat predictable columns. This means that you can identify the 'instruction' and 'mode' separately, which then lets you decouple the Instruction logic from its Addressing logic.

This 'decoupling of concerns' can really help shorten your code and reduce errors _(less code, as every Instruction-Type is "addressing agnostic" ... and less repetition, as each "Addressing logic" is only written once and is shared across all instructions)_

Just an idea for future exploration : )

Unfortunately, sometimes this bit-masking strategy isn't perfect, so you might have to handle some exceptions to the rule.
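
As a hedged illustration of the bit pattern: many 6502 opcodes follow the layout `aaabbbcc`, and for the group where `cc == 01` the `aaa` bits select the operation and the `bbb` bits select the addressing mode. The sketch below decodes only that one group. The enum and function names are made up for this example:

```cpp
#include <cstdint>

using Byte = std::uint8_t;

// Operations of the cc == 01 group, in aaa order (000..111).
enum class Op { ORA, AND, EOR, ADC, STA, LDA, CMP, SBC };

// Addressing modes of the cc == 01 group, in bbb order (000..111).
enum class Mode {
    IndirectX, ZeroPage, Immediate, Absolute,
    IndirectY, ZeroPageX, AbsoluteY, AbsoluteX
};

struct Decoded {
    bool valid;  // false if the opcode is not handled by this decoder
    Op op;
    Mode mode;
};

// Decode an opcode of the form aaabbb01 by masking and shifting.
Decoded DecodeGroupOne(Byte opcode) {
    const Decoded invalid{false, Op::ORA, Mode::IndirectX};
    if ((opcode & 0x03) != 0x01) return invalid;  // other groups not covered

    const Op op = static_cast<Op>(opcode >> 5);
    const Mode mode = static_cast<Mode>((opcode >> 2) & 0x07);

    // One of the exceptions to the rule: there is no "STA immediate" (0x89).
    if (op == Op::STA && mode == Mode::Immediate) return invalid;

    return Decoded{true, op, mode};
}
```

For example, 0xA9, 0xA5 and 0xB5 all decode to LDA, with the modes Immediate, ZeroPage and ZeroPageX respectively.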


*My experiences, for what it's worth...*

Last time I emulated an 8-bit fixed-instruction-length processor... I wrote each instruction handler as a function, then mapped them into a function-pointer array of 256 entries. That way (due to ignoring mode differences) several opcodes in an instruction group all called the same basic handler function. I then did the same thing with the modes, in a separate array ... also of 256 entries.

So, every Instruction was invariably a call to : fn_Opcode[memory[PC]] ... using the mode handler : fn_Mode[memory[PC]]

That got rid of any conditionals or longwinded case statements... just one neat line of code, that always called the appropriate Opcode/Mode combination... because the two tables encoded all the combinations.

Hope that makes sense ; )

Obviously, to ensure that this lookup always worked - I first initialised all entries of those tables to point at the 'Bad_Opcode' or 'Bad_Mode' handler, rather than starting life as NULLPTRs. This was useful for debugging ... and for spotting "undocumented" opcodes ; )

It also meant I knew I could ALWAYS call the function pointers ... I didn't have to check they were valid first ; ) It also meant that unimplemented opcodes were self-identifying and didn't crash the emu ; ) As I coded each new Instruction or Mode, I'd just fill out the appropriate entries in the lookup arrays.
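
A rough sketch of the two-table idea described above, not the original code. The `Machine` struct, the handler names and the signatures are all assumptions, and only three LDA opcodes are filled in:

```cpp
#include <array>
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;

// Hypothetical machine state shared by all handlers.
struct Machine {
    Byte A = 0, X = 0;
    Word PC = 0;
    bool badOpcode = false;  // set by the Bad_* handlers instead of crashing
    std::array<Byte, 0x10000> memory{};
};

using ModeFn = Word (*)(Machine&);      // returns the operand's address
using OpFn = void (*)(Machine&, Word);  // performs the operation on it

// Default handlers so every table entry is always safe to call.
Word BadMode(Machine& m) { m.badOpcode = true; return 0; }
void BadOp(Machine& m, Word) { m.badOpcode = true; }

// Addressing logic, written once and shared by every instruction.
Word ModeImmediate(Machine& m) { return m.PC++; }
Word ModeZeroPage(Machine& m) { return m.memory[m.PC++]; }
Word ModeZeroPageX(Machine& m) {
    return static_cast<Byte>(m.memory[m.PC++] + m.X);  // wraps inside page zero
}

// Operation logic, unaware of how the address was produced.
// (Flag updates are left out to keep the sketch short.)
void OpLDA(Machine& m, Word addr) { m.A = m.memory[addr]; }

struct Tables {
    std::array<OpFn, 256> op;
    std::array<ModeFn, 256> mode;
};

Tables MakeTables() {
    Tables t;
    t.op.fill(&BadOp);      // never leave null pointers in the tables
    t.mode.fill(&BadMode);
    t.op[0xA9] = &OpLDA; t.mode[0xA9] = &ModeImmediate;
    t.op[0xA5] = &OpLDA; t.mode[0xA5] = &ModeZeroPage;
    t.op[0xB5] = &OpLDA; t.mode[0xB5] = &ModeZeroPageX;
    return t;
}

// One instruction: fetch the opcode, resolve the address, run the operation.
void Step(Machine& m, const Tables& t) {
    const Byte opcode = m.memory[m.PC++];
    const Word addr = t.mode[opcode](m);
    t.op[opcode](m, addr);
}
```

An opcode that has not been filled in yet simply sets `badOpcode`, so unimplemented instructions identify themselves.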


But the real beauty of this approach was brevity!

If my Operation logic was wrong, I only had to change it in one place... and if my Addressing Mode code was wrong, I only had to change it in one place. A lot less typing and debugging... and a lot less chance for errors to creep in.

Not a criticism though... far from it!

I just thought I'd present just one more approach - from the millions of perfectly valid ways to code a virtual CPU : )

Understanding how the CPU, internally, separates 'Operation' from 'Addressing' quickly and seamlessly... is damned useful, and can help us emulate the instruction set more efficiently : ) But, ultimately, you might have to also handle various "ugly hacks" the CPU manufacturer used to cram more instructions into the gaps.

By using two simple lookup tables, one for Operation and another for Mode ... you can encode all of this OpCode weirdness in a simple efficient way... and avoid writing the mother of all crazy Switch statements XD

Author — GaryChap

Author

Really nice to see how you've set this up. Clear coding.
Just wondering about the clock: When you model the system clock as a separate entity, it could remove the Execute method from CPU. Just let the clock run, optionally for a fixed number of ticks. Instead of the cumbersome counting of Cycles, subdivide each instruction in Steps and push those on an internal stack in CPU. That would make it possible to halt the CPU for inspection.
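
A minimal sketch of that idea, assuming a hypothetical `SteppedCpu` that handles only LDA Immediate (0xA9). It uses a FIFO queue rather than a stack, so the steps run in the order they were queued. One call to `Tick` stands for one cycle:

```cpp
#include <array>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

using Byte = std::uint8_t;
using Word = std::uint16_t;

// Hypothetical CPU that does one step of work per clock tick.
struct SteppedCpu {
    Byte A = 0;
    Word PC = 0;
    std::array<Byte, 0x10000> memory{};
    std::deque<std::function<void(SteppedCpu&)>> pending;  // remaining steps

    void Tick() {
        if (pending.empty()) {
            // First cycle of an instruction: fetch the opcode and queue the rest.
            const Byte opcode = memory[PC++];
            if (opcode == 0xA9) {  // LDA Immediate: one more cycle for the operand
                pending.push_back([](SteppedCpu& cpu) { cpu.A = cpu.memory[cpu.PC++]; });
            }
            return;
        }
        // Later cycles: run exactly one queued step.
        auto step = std::move(pending.front());
        pending.pop_front();
        step(*this);
    }
};

// The clock is a separate entity that just drives ticks.
void RunClock(SteppedCpu& cpu, unsigned ticks) {
    for (unsigned i = 0; i < ticks; ++i) cpu.Tick();
}
```

Stopping the clock after any tick leaves the CPU in an inspectable state, even in the middle of an instruction.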
But yeah, it is not a full implementation :-)

Author — Harry de Kroon

Author

Been looking for a series like this for years, bloody brilliant work mate! Keep it up!

Author — TheDarkSide11891

Author

Great video Dave, thanks for sharing. I did a similar thing nearly 30 years ago. Wrote a Z80 emulator in compiled Blitz Basic on the Commodore Amiga. Got it to load and 'run' ZX Spectrum snapshot game files. It could actually render the Speccy's screen but my god it was slow - took about 5 minutes to render one frame! A 0% useful but 100% interesting project!!

Author — Lee W

Author

When I was in college (1980s), the assembler class I was taking didn't cover how to do output but instead had us dumping memory [to paper] and then highlighting and labeling the registers and key memory locations. I do recall reading files at some point because, due to a bug, I corrupted my directory structure and lost access to my home dir. Thanks to a brilliant Lab Tech (he was like 14 or so and attending college), my directory was restored. I couldn't say if that was from a backup or if he fiddled with the bits to correct the directory but I'm pretty sure it was the former.

Author — Kevin Olive

Author

Very interesting. Earlier this year I was wanting to expand my knowledge of Java and went through a similar exercise. I had a Heathkit ET-3400A microcomputer a long time ago, and I wrote a functional ET-3400A emulator that runs the ET-3400A ROM in an emulation of a Motorola 6800.

Author — davesherman74

Author

Love this. When I was in University in the 1980s, we had to write a microcode engine to implement instructions for the 6809 and get a simple program to execute on it. We had to write the microcode for each instruction. We were given the microcode instructions for the RTL (register transfer language). You could create a microcode engine that could then run any instruction set on top of it! Set the microcode engine up as a state machine to make life a bit easier. At the time we were actually using an IBM/370 and the VM operating system so we each had our own virtual machine, but the microcode engine had to be written in 370 assembler and boot as a virtual machine on the mainframe! These days the average PC is capable of this with relative ease!

Author — P. Wingert

Author

I actually wrote a 6502 emulator in C on my Atari-ST (68000 CPU) in 1987. I was quite proud of it. It used a kind of jump table for the actual instructions. I made an array of pointers to functions, and used the content of the instruction register as an offset into this array to call each Op-code function. For example at A9 was a pointer to the LDA immediate mode function. I started off writing a cross-assembler, and then wanted to test the resulting machine code and so wrote the emulator for it. Amazingly, after all these years I still have the source code!

Author — Martin Stent

Author

29:00

$fffc is a vector, so if those bytes are loaded there the 6502 will load the PC with $42a9 and try to execute that memory, which contains $00 (BRK) at the moment.
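
A short sketch of what the reset vector implies for an emulator, assuming a flat 64 KiB memory array. `ReadWord` and `ResetPC` are hypothetical helper names:

```cpp
#include <array>
#include <cstdint>

using Byte = std::uint8_t;
using Word = std::uint16_t;
using Memory = std::array<Byte, 0x10000>;

constexpr Word kResetVector = 0xFFFC;  // low byte at $FFFC, high byte at $FFFD

// Read a little-endian 16-bit value from memory.
Word ReadWord(const Memory& mem, Word addr) {
    const Byte lo = mem[addr];
    const Byte hi = mem[static_cast<Word>(addr + 1)];
    return static_cast<Word>(lo | (hi << 8));
}

// On reset the PC is loaded from the vector rather than set to $FFFC itself.
Word ResetPC(const Memory& mem) { return ReadWord(mem, kResetVector); }
```

With the bytes $A9 $42 stored at $FFFC, `ResetPC` returns $42A9. To run a test program placed at, say, $8000, the vector would hold $00 $80 instead.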

Author — Cigmorfil

Author

Looks like fun. Learnt programming with a 6502 and opcodes as part of my EE degree. Was thinking of replicating a 6502 on a Nexys2 I've had knocking about for ages. Might do it in Rust first.

Author — jonnoMoto

Author

I have a problem with the way the emulator appears to be trying to merge the emulation of multiple elements of a computer into one lump rather than treating them as discrete components tied together by the master clock. To me, the bit of the code that is emulating the microprocessor ought to do only that. That means things like the registers of course, but it should not include the memory. That's strictly external, and what should be emulated is the state of the physical interfaces that allow memory to be read from and written to. What's just as critical is that this includes any I/O devices, which are all memory mapped on a 6502. I would even expect an emulator to model the state of all the output pins of the 6502 chip at any point. Making the whole thing clock driven means there's a chance of it actually behaving like a real computer with proper time synchronisation. It's fine to have an emulator which runs faster than the real thing, but in a controlled way (fine clock control may not be completely possible with sub-microsecond timing but there are ways to deal with even that).

All the necessary discrete components of the computer ought to have their own definitions so that it's possible to build up a "virtual computer" that is made up of components. That should include an external clock, complete with phases, which runs as the master loop to drive the execution of instructions and the interfaces to any other components, among which is memory. Memory is not part of a CPU definition. Instead, what is required is one or more discrete memory emulators which can be mapped into the address space. Memory isn't necessarily contiguous, and there are both RAM and ROM forms too. Then there are I/O devices, which on a 6502 are memory mapped, and those too ought to have their own discrete definitions. Things like keyboard interfaces, serial drivers, sound chips and so on. They don't, of course, all have to be there from the start but the ability to add them will make the whole model much easier to extend. As a guess, the starting position ought to include that master clock, the microprocessor emulation, RAM & ROM emulators and some initialisation meta-routine that allows memory to be loaded with a program. I suspect some of these emulations, like RAM and ROM, will be very simple but it seems to me important that the CPU emulator ought to mimic the setting of the address, data and R/W pins and that this is what ought to drive memory reads and writes (as well as any other memory mapped device).
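
One possible shape for such a component model, sketched under assumptions: a hypothetical `Bus` that routes each address to whichever `Device` is mapped there. The class names are invented for illustration, and the clock and pin-level signals are left out:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Byte = std::uint8_t;
using Word = std::uint16_t;

// Anything that can sit on the bus: RAM, ROM, or a memory-mapped I/O chip.
struct Device {
    virtual ~Device() = default;
    virtual Byte Read(Word offset) = 0;
    virtual void Write(Word offset, Byte value) = 0;
};

class Ram : public Device {
public:
    explicit Ram(std::size_t size) : data_(size, 0) {}
    Byte Read(Word offset) override { return data_[offset]; }
    void Write(Word offset, Byte value) override { data_[offset] = value; }
private:
    std::vector<Byte> data_;
};

class Rom : public Device {
public:
    explicit Rom(std::vector<Byte> image) : data_(std::move(image)) {}
    Byte Read(Word offset) override { return data_[offset]; }
    void Write(Word, Byte) override {}  // writes to ROM are ignored
private:
    std::vector<Byte> data_;
};

// The bus is external to the CPU and owns the address decoding.
class Bus {
public:
    // Map a device into the inclusive address range [first, last].
    void Attach(Word first, Word last, Device& device) {
        map_.push_back(Mapping{first, last, &device});
    }
    Byte Read(Word addr) {
        for (const Mapping& m : map_)
            if (addr >= m.first && addr <= m.last)
                return m.device->Read(static_cast<Word>(addr - m.first));
        return 0xFF;  // assumption: unmapped addresses read as 0xFF
    }
    void Write(Word addr, Byte value) {
        for (const Mapping& m : map_)
            if (addr >= m.first && addr <= m.last) {
                m.device->Write(static_cast<Word>(addr - m.first), value);
                return;
            }
    }
private:
    struct Mapping { Word first; Word last; Device* device; };
    std::vector<Mapping> map_;
};
```

A CPU emulation written against `Bus::Read` and `Bus::Write` needs no knowledge of what is mapped where, so a keyboard interface or sound chip could later be attached as another `Device`.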

This doesn't mean that the whole thing must be produced at once, nor that some shortcuts can't be used to mimic behaviour, but they ought to sit within the model of discrete components. For example, there could be a routine to populate memory with a program to be executed, but it would sit outside the actual emulation routines themselves. That way it could be easily removed once a more general routine to load programs was introduced, such as something that reads in from an emulated disk or tape interface. Perhaps easier is an emulation of a cartridge being inserted and appearing as ROM (real 6502 computers often had paging registers to allow memory overlapped ROM cartridges).

The issue to me is that this is being thought through from the point of view of software design, and not that of an electronic engineer. If somebody wants an accurate emulation, then it's critically important that it should reflect the structure and interfaces of a real computer. Perhaps not as precisely as simulating periods of rising and falling clock and interface signals, at least for this sort of use, but at least something that models the working of a computer.

Author — Steve Jones

Author

You might wanna take a look at the <cstdint> header. It defines portable integer types of fixed width.

Author — Erf Unden