Welcome. This is a guide meant to teach 65c816 Assembly to people that already know a thing or two about ROMhacking. If you have some experience hacking, and you want to learn ASM with Super Metroid as a context, this is the guide for you!
You will need:
NOTE: The lessons on this wiki page are edited versions of the IRC logs, and are to be edited further over time for easier reading and understanding. The original logs are available within the .zip files for each lesson.
ASM is neither very easy nor very hard. The lazier you are, the harder it gets. Furthermore, ASM is knowledge, NOT skill! The more you know, the more you can do; it's that simple.
Let's start our first lesson: Learning basics.
You can create a lot of new code for some pretty cool things with just simple, basic ASM. First, we must go over just how the RAM works in SM (and basically everything else computer-related). RAM is short for Random Access Memory; it is a section of memory, much like ROM, but it's not 'physically' there. It contains all kinds of values, pointers, etc. that the game uses during gameplay; anything the game reads and might need to read later is stored in RAM. For example, there is a RAM location for Samus' current energy. Without that RAM location, the game can't keep track of how much energy you have throughout the game.
In this case, the RAM is bank $7E, which can't be accessed externally (meaning you can't open it with a hex editor). Bank $7E exists only in the hardware, which is unviewable without a system that allows you to view its RAM while running a ROM.
At this point, we'd like you to open Kejardon's RAM map. At first you'll be like 'WTF is this…?' It looks very messy at first, but as soon as we tell you how it works, it'll be no problem. Look at the values on the left side; these are addresses in the RAM. Many of these will make no sense at all just by looking at them. The easiest ones to look for are around $7E:09A2, so do a search for $7E:09A2 (or just scroll down a bit). You'll find this:
7E:09A2 - 7E:09A3 Equipped items
The 2 bytes at $7E:09A2 keep track of every powerup you have equipped, while the 2 at $7E:09A4 keep track of every powerup collected. Why are there 2? Because the game needs to keep track of everything you have picked up, but also needs to check whether or not you actually have that item equipped.
“Isn't 2 bytes a really small space to keep track of all the items you have equipped/collected?”
True that… But, those 2 bytes are made up of a lot more bits. If there are 2 bytes, there are 16 bits. (Note: $ before a number is used to indicate hexadecimal.) $00 is zero; $04 is 4 in both hex and decimal. But in binary, $04 would be 0100. Counting in binary, it goes 0000, 0001, 0010, 0011, etc. Since each item can only ever be 'equipped' or 'not equipped' we can give each item its own bit.
Bit masking is great; it means you can easily assign a single number to any possible combination of items. Say bit 3 is reserved for Screw Attack, bit 2 for Morph Ball, bit 1 for Spring Ball, and bit 0 for Varia. Then, when you have Morph Ball equipped, the value would be 0100 ($04). If you have Morph Ball AND Varia, it would be 0101 ($05). Those four bits cover every possible combination of those four items using just half of a byte. We have 2 bytes to use, which is 16 bits, so every possible equipped/unequipped combination of up to 16 items can be represented as a single 2-byte number. The same theory applies to beams at $7E:09A6. Even though there are far fewer beams, there are still 16 bits to use.
Anyway, if you look a bit further down the RAM map, at $09C2, you'll find Samus' current health. (Note: 0x before numbers is also used to indicate hexadecimal. It's a context thing.) If Samus has 01 energy, those bytes will be 0x0001. If Samus has 99 energy, those bytes will be 0x0063 (remember the RAM counts in hex!), etc. The address directly after that, at $09C4, keeps track of Samus' MAX energy. That value will never be less than 0x0063 in the original ROM, because you can never be maxed at less than 99 energy. It's also safe to assume that the value at $09C2 will never be greater than the value at $09C4.
“So, max is 1499… That doesn't include reserves, I presume, since they're below?”
That's correct. Now, if we wanted to write code that refilled Samus' health, we would load the value from $09C4, and store it to $09C2. That's all; that would completely refill Samus' health. The thing about RAM is, it not only stores the status of the game, but it affects it. If you write to RAM while the game is running, it will affect the game, because the game also reads from RAM instead of just storing things there. This is the basis of ASM; we edit the values in RAM to create the effects we want in the game.
All right, time to start writing basic code!
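The Example 1 file itself is not reproduced on this page. Going by the description that follows, its core is presumably something like the sketch below. The 'org' and 'DW' lines that hook the routine up to a block are left out here, since their addresses are not given on this page.

```asm
; Sketch of the energy refill routine described in this lesson.
; Not the original Example 1 file; hookup lines (org/DW) are omitted.
    LDA $09C4    ; load Samus' MAX energy into the accumulator
    STA $09C2    ; store it to Samus' CURRENT energy
    RTS          ; end this routine and return to the caller
```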
Opened that and had a look? Basically, LDA means LoaD into Accumulator, and STA means STore Accumulator to [address]. RTS is ReTurn from Subroutine. The accumulator (A for short) is a register inside the CPU that holds the number or value currently being worked on. What the Example 1 code does, is it loads the value currently at $09C4 into the accumulator. The accumulator now holds Samus' maximum energy. Then, we store the value in the accumulator to $09C2, which puts Samus' max health into her current health, which restores her HP. The RTS simply tells the game to end that particular routine, and return to the routine that was running previously.
Also, we tell the game to load a value in two ways: immediate (direct value) and absolute (from an address). For instance, immediate:
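The snippet that belongs here is missing from this page. Based on the values discussed right after it, a sketch of the two forms side by side might be:

```asm
; Sketch only; reconstructed from the surrounding description.
    LDA #$000A   ; immediate: A now holds the literal value $000A (10 decimal)
    LDA $09C4    ; from an address: A now holds whatever is stored at $7E:09C4
```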
Notice the prefix '#' before the dollar sign. This means we now have the direct value of 0x000A loaded to the accumulator (10 in decimal). You see where I put LDA $09C4; that will load a value from the address $7E:09C4. If the value at that address is #$0100, then A will hold #$0100. But, if I had put LDA #$09C4, the value in A would be #$09C4 (which is very high…). Also, this would always store the value #$09C4 into the current energy. Now, if we wanted to give the player #$0050 energy, regardless of how much energy the player had:
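The code for this case is also not shown on this page. A minimal sketch, with the 'org'/'DW' hookup again omitted because its addresses are not given here:

```asm
; Sketch: set current energy to a fixed amount, whatever it was before.
    LDA #$0050   ; immediate value: the number $0050 itself, not an address
    STA $09C2    ; overwrite Samus' current energy with it
    RTS
```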
I probably should've mentioned… The 'org' just tells Xkas where in the ROM to write this ASM, so you won't need to worry about that for now. If you compile it like it is now, this ASM will run whenever you touch a Fool Xray Air block with a BTS of 03. Xkas commands like 'DW' and 'org' are explained more later.
Okay, question: how would you restore your missiles instead of energy? Look at the RAM map and try to answer before reading on.
“LDA $09C8 and STA $09C6”
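Written out as a full routine, that answer would look roughly like this (a sketch that uses only the two addresses from the answer above):

```asm
; Sketch: refill missiles the same way energy was refilled.
    LDA $09C8    ; load max missiles
    STA $09C6    ; store to current missiles
    RTS
```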
This is done exactly the same way if you want to restore missiles, supers, power bombs or reserve tanks. Just LDA/STA the max/current values for those accordingly. All right, time to start explaining how to compile your code.
Okay, let's write an .asm file! I prefer changing the file type with Windows, so please make a new file called Test.asm. Also make a copy of your clean ROM (unheadered is best) and rename it Test.smc. When you're done, open the .asm file. Copy the code from Example 1, paste it into your Test.asm file, and save. Now, if you drag the .asm file over Xkas.exe, the code should be automatically compiled to the ROM. To test if the code works, use SMILE to place an Air-Fool X-ray tile with BTS 03 somewhere (you should be able to touch it in-game) and quickmet. Do a shinespark or anything else to lose a bit of your health, then touch the block. It should restore your energy; does it work properly?
All right, now I have your first task. It's simple though, so don't freak out.
See if you can make a block that fully restores your missiles, super missiles, and powerbombs, then sets your energy to 1.
Class run by Scyzer.
Today, I'm just gonna go through the branching opcodes, which are used to make variable code. Got your RAM map open?
First, we're gonna learn about conditional branching, which is done by comparing values at addresses with other addresses or immediate values.
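The example being discussed is not reproduced on this page. A sketch that matches the description that follows, where END is a label name made up for illustration:

```asm
; Sketch: refill energy only if it is not already full.
    LDA $09C4    ; load Samus' max energy
    CMP $09C2    ; compare it against her current energy
    BEQ END      ; equal means already full, so skip the refill
    STA $09C2    ; A still holds max energy; store it to current energy
END:
    RTS
```

CMP only sets flags for the branch to read; it leaves the value in A alone, which is why the STA can reuse it.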
As you can see, it's the same basic energy-restoring routine from last lesson, but this time it checks to see if Samus's energy is already full. It loads Samus' max energy, then compares that value against Samus's current energy; if the two values are equal, the routine branches to the end. This is basically what conditional branching is all about. There are also a handful of other branching options, the most common ones being BNE, BPL, BMI, and BRA. Keeping in mind that BEQ stands for Branch if EQual, do you have any idea what the others might mean?
“Branch never, branch plus, branch minus, branch always?”
Almost… BNE actually means Branch if Not Equal; there wouldn't be any point in having a branch never. Now, branching basically just lets a routine skip parts of code, or run other code, if certain conditions are met. Take the energy warning beep for example; the check for it runs constantly through the game. The reason the game doesn't beep constantly is that the game checks Samus' energy and compares it with 0x001E (30), and only if it is less than that does the beep activate. Any questions?
“What's the point of a BRA when you can JSL/JSR to the code you want instead? Is it just a space-saver?”
Sort of. JSR and JSL take the game to a subroutine, which means that upon hitting RTS/RTL, the game jumps back to the instruction straight after the JSR/JSL. BRA works similarly to a JMP; when the game hits RTS/RTL, it doesn't jump back to the instruction after the BRA, but rather the routine that led to the one it's currently at. I'll make a little ASM to explain better.
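That ASM is not included on this page. A sketch consistent with the walkthrough below, where MAIN, SUBROUTINE, REFILL and MISSILE are label names assumed for illustration:

```asm
; Sketch: the same subroutine reached by JSR (path 1) or JMP (path 2).
MAIN:
    JSR SUBROUTINE   ; path 1: the RTS inside SUBROUTINE returns to the next line
    RTS              ; ...which is this RTS
    ; For path 2, swap the JSR above for: JMP SUBROUTINE
    ; The RTS inside SUBROUTINE then returns to whoever called MAIN.

SUBROUTINE:
    LDA $09C2        ; current energy
    CMP $09C4        ; compare with max energy
    BNE REFILL       ; not equal: go refill energy
    BRA MISSILE      ; energy was full: always go refill missiles
REFILL:
    LDA $09C4
    STA $09C2
    RTS
MISSILE:
    LDA $09C8        ; max missiles
    STA $09C6        ; current missiles
    RTS
```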
If we follow the JSR SUBROUTINE path, the game enters the routine normally. It checks Samus' energy and compares it to her max energy; if the two are NOT equal, then the routine branches across to refill it, and then ends the routine. The game would then jump back to the instruction straight after the original JSR, which is another RTS. Now, let's assume her energy was full. The game compares them, and they're equal, so it doesn't branch. Now it hits a branch always, which means we always go to the subheading MISSILE. That just refills Samus's current missiles, then RTS. That RTS doesn't lead us back to the BRA, but rather, the JSR we were at before. Make sense?
Now let's follow the JMP SUBROUTINE path. This still takes the game to the next routine, but upon hitting an RTS, it doesn't lead back to after the JMP; it ends the routine completely. JMP and BRA merely jump to different areas in the same routine, without entering into a subroutine. JMP is far more flexible than BRA, but since BRA is only 2 bytes (80 XX, where XX is the signed number of bytes to jump), it saves space. JMP can jump anywhere in the current bank (or anywhere in the ROM when given a long, 3-byte address), without hassle, while BRA is limited to 127 bytes forwards or 128 bytes backwards.
Okay, now what I want to show you is how to refill energy over time, using a timer. I'll show you 2 new instructions: INC and DEC.
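The routine is not reproduced on this page. A sketch that follows the step-by-step description below, with END and HEAL as made-up label names and `INC A` assumed as the accumulator form your assembler accepts:

```asm
; Sketch: refill 1 energy each time a frame counter reaches $0010.
    LDA $09C2        ; current energy
    CMP $09C4        ; compare with max energy
    BEQ END          ; already full: do nothing
    LDA $05D5        ; load the counter (debug-only address per the RAM map)
    CMP #$0010       ; has it reached $10 yet?
    BEQ HEAL         ; yes: go give energy and reset the counter
    INC A            ; no: counter + 1
    STA $05D5        ; store the new counter value
END:
    RTS
HEAL:
    LDA $09C2
    INC A            ; energy + 1
    STA $09C2
    LDA #$0000
    STA $05D5        ; reset the counter so counting starts again
    RTS
```

Counted strictly, a counter that starts at 0 needs $10 increments to reach $10 and the refill lands on the frame after that. If the exact frame count matters, compare against one less than the interval you want.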
Now things become a bit more complicated. The routine checks to see if Samus has full energy first; if she does, the routine ends and that's it. If she doesn't, then a “timer” is loaded into A from $05D5. If you look at the RAM map, you'll see that this is just an address used during debug only and is otherwise unused, so we can use this location as a counter. So, what happens is, the value at $05D5 is loaded and compared with #$0010. If it is equal, then the routine branches, increases Samus' energy by 1, and stores 00 back to the counter; this is so we can start counting again. If the counter wasn't at #$10 yet, then the routine simply INCrements (increases) the value in A and stores it back to $05D5. Since this routine is run every frame, Samus' energy would be increased by 1 every #$10 frames. The other frames simply increase the counter, and the 16th frame clears the counter and increases her energy. You could really set it to any number of frames you want (16, 20, 5, etc.) just by changing the CMP #$0010. Change it to #$0005, and your energy would be increased every 5 frames. Understand?
This is homework. Restore Samus' energy every 5th frame, at the cost of 1 missile every 20th frame. If you run out of missiles or have full energy, do nothing.
“I have another question, if you don't mind…”
“About JMP - does it have to specify bytes, or can it point to a specific place in the ROM?”
JMP specifies an address. BRA is relative, while JMP is absolute. This is what gives JMP that freedom. If you use BRA, you'd put something like BRA $10 to jump ahead #$10 bytes; if you use JMP, you would JMP $8000 to jump to $8000 in the current bank, or you can specify a bank like JMP $8FFB00 (this long form of the jump is commonly written JML).
Class run by Scyzer.
Okay, this time I'm gonna get into selective bits (useful for checking for specific items, events, etc).
Basically, each address in the RAM is split up into bits. 1 byte contains 8 bits; FF in hex is equal to 11111111 in binary. Single bits are pretty much only used as flags, rather than having all 8/16 work as a whole value or counter. If you look in the RAM map at $09A2 (equipped items), this location is actually just 16 bits (2 bytes) used as flags for each item being equipped. Since an item can only be on or off, each item is assigned a single bit, which is either on or off. Using these bits, EVERY combination of equipped items can be represented with just 2 bytes.
Here's an example of checking for a specific item. Again, this is for fool xray BTS $03. This code loads the equipped item bits, then tests for a certain bit (in this case, bit 0, value #$0001, which is used by the Varia suit). The next part confuses most people; BEQ and BNE are used when testing for specific bits, but they are used in reverse. Branch Not Equal will branch if the bit came back true (the bit was SET), whereas Branch EQual will branch if the bit was false (RESET, or 0). Let's assume that Samus has Varia suit (0001), screw attack (0008), and morph (0004) equipped. LDA $09A2 would yield the bits of 0000 0000 0000 1101. 1101 = #$000D. Therefore, the value at $09A2 would be #$000D. Now, if we test for Varia against these equipped item bits:
0000 0000 0000 1101 AND 0000 0000 0000 0001 = 0000 0000 0000 0001
Since this “test” result is NOT zero, a BNE will branch, whereas if the bit were not set originally, the result would be all zeroes. Thus, a BEQ would branch, while BNE would not. The “equal” in these opcode names really means “the result was zero”, which is why they look reversed when testing bits. Does this BNE/BEQ reversal make sense?
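The item check itself is not shown on this page. A sketch of the test described above follows. What the original block did once Varia was found is not stated, so the energy refill here is purely an illustrative assumption, as is the label name VARIA.

```asm
; Sketch: do something only if the Varia suit is equipped.
    LDA $09A2        ; load the equipped item bits
    BIT #$0001       ; test the Varia bit (A itself is not changed)
    BNE VARIA        ; result not zero = bit is SET, so branch
    RTS              ; bit was clear: do nothing
VARIA:
    LDA $09C4        ; illustrative payload: refill energy
    STA $09C2
    RTS
```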
Now I'm going to get into turning specific bits on or off. Let's say you wanted to make a block that gives you an item, or make a barrier that removes or unequips items. If you only want to take or give a certain item, you'd use the opcodes AND or ORA respectively. AND is used to turn specific bits off, while ORA sets them. If those bits were already off/on, the instruction would do nothing. With these 2 instructions, though, it is important to note that while they are opposites, they use the value given with the opcode differently; AND will turn off any bit that is NOT set with the argument, whereas ORA turns on bits that ARE set by the argument. For instance, if we LDA a value:
LDA #%1111000010101100
AND #%1111111100001000
A = #%1111000000001000
In simple form, the result is that only bits that are set in BOTH values will return true, otherwise they will be reset. Another example:
LDA #$FF0F
AND #$F0F0
A = #$F000
The other instruction, ORA, sets all bits given by the argument:
LDA #%1111000010101100
ORA #%1111111100001000
A = #%1111111110101100

LDA #$00F0
ORA #$F000
A = #$F0F0
I'll write up some examples.
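Those examples are not reproduced on this page. A sketch of two routines matching the description below, with the 'org'/'DW' hookup for BTS $03 and $04 omitted because the addresses are not given here. The Varia value #$0001 comes from this lesson. The Gravity value #$0020 is an assumption, so confirm it against the item's value in SMILE's PLM editor. GIVE_VARIA and TAKE_GRAVITY are made-up label names.

```asm
; Sketch: block 1 gives and equips Varia; block 2 unequips Gravity.
GIVE_VARIA:
    LDA $09A2        ; equipped items
    ORA #$0001       ; turn the Varia bit ON, leave the rest alone
    STA $09A2
    LDA $09A4        ; collected items
    ORA #$0001       ; also mark Varia as collected
    STA $09A4
    RTS

TAKE_GRAVITY:
    LDA $09A2        ; equipped items
    AND #$FFDF       ; every bit set EXCEPT $0020 (assumed Gravity bit): turns it OFF
    STA $09A2
    RTS
```

Note that the AND argument is the inverse of the bit you want to clear, which is the asymmetry between AND and ORA described above.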
That example creates 2 blocks. The first (BTS $03 fool xray) will give and equip the Varia suit when touched. The second (BTS $04), when touched, will unequip Gravity suit. Although, the only problem with equipping/unequipping suits like this is that Samus's palette doesn't refresh until she changes palettes normally (by screw attacking, speedboosting, charging, etc.)
Well, I think that covers the simple opcodes for selective bits.
“This was a short one, though there's not much to be done with just checking for bits without doing anything else, most likely something completely unrelated to selective bits XD.
Just some notes about the log and asm.
You can use SMILE to find specific bits used by items. It is the “value” of the item in the PLM editor/selector when you right click on a PLM.
$ is used to denote a hexadecimal value, e.g. $F000, $EA, $1034
% is used to denote a binary value, e.g. %1100110011001100, %0010111101110001. Xkas automatically converts binary values to hexadecimal when compiling into the ROM
# is used to indicate an immediate value, or number value. Without this symbol, any of the above would be read as an address.
AND #$FFFF does nothing
ORA #$0000 does nothing
BEQ will ALWAYS branch after a BIT #$0000
BNE will always branch after a BIT #$FFFF, UNLESS the value being tested against is #$0000”
I do realize that this has pretty much zero usefulness in any game, but it's good practice.