Super Metroid Assembly Style Guide

A guide that helps you write and organize your ASM code correctly, to optimize your workflow and keep you motivated! :)
Please note that this is NOT an ASM tutorial!
You have to know how ASM code works and how it is stored in the game's ROM in order to make use of this guide. Also, don't take the example code I use too seriously; it's just for educational purposes, so it doesn't matter whether it makes sense or not.

INTRODUCTION

ASM coding has become incredibly popular over the last few years. I can well remember the old vanilla hacks that were amazing back when ASM wasn't really a thing. Nowadays the number of patches and ASM resources has grown considerably, and every hack contains at least some hex tweaks or a new HUD. Naturally, the more ideas are presented in a hack, the more people want to come up with new ideas and code their own thing. The way these people write their code can differ a lot, which sometimes results in inefficient or just cluttered-looking code that is not really viewer-friendly.

The goal of this guide is to give ASM coders an idea of what I think organized code should look like and how organizing it helps you optimize it. This matters especially if you're going to release the source later, because other coders then have to understand what's going on from a different perspective, which is not an easy task. You should know that this is by no means an “official way to write ASM code”; you are free to do whatever you think is best.
I too am open to suggestions and improvements.
In this guide I will also cover, and where necessary expand on, ways to improve your workflow for Super Metroid ASM coding, including which tools I use and how I disassemble large portions of code into xkas-friendly source in a matter of minutes using the tools listed below.

Anyway, this introduction has been a bit long, so let's get started!


1. Getting The Right Setup

Essential stuff any ASM coder should have before even reading this guide:

  • SMILE - You might be thinking, “Well, duh..”, but the essential feature here is the file comparison.
  • Lunar Address - If you don't know what this is for, then you probably shouldn't be here. What is a bank and what is a pointer? GO LEARN!
  • xkas v0.06 - An extremely useful program for writing and maintaining ASM code, also used to create and apply patches.
  • Debugging Emulator - bsnes-plus or Geiger's SNES9x Debugger - Geiger's v1.51 may have some minor issues (mostly graphical), bsnes-plus is more recent. Both are highly useful for disassembling code, take your pick.
  • Hex Editor - HxD recommended, also has file comparison if you prefer this one.
  • Calculator - Anything that can calculate and convert between hexadecimal, decimal and binary.

Recommended tools that, once you've used them, you won't want to be without:

Even more essential stuff and the reason I get out of bed every morning:

  • Coffee - Needed in order for programmers to convert it into code.

Other things:

  • A Brain - Well, not quite as essential maybe, some people don't have one, so you don't necessarily need it.

Setting up Notepad++

N++ is one of the most versatile and popular code editors out there, if not the most. It supports lots of different programming languages such as batch or C++. If the language you need is not supported, you can simply define it nice and easy using the custom language feature. Lucky for you, I already made one for SNES ASM so you don't have to bother.
To add the ASM language stylesheet, open up N++ and go to Language > User-defined (at the very bottom). Alternatively, you can simply click the toolbar icon for it (it's a window with a lightning bolt drawn into it).

In the opened window, click Import and select my stylesheet:

Now the default theme has a white background, which is a pain to the eyes (that's why I use an all black background). To change that, click Settings » Style Configurator, and under Select Themes, select a dark theme you like (I'm using Black board). If you're still not happy with the slightly blueish background or whatever color your theme's background has, you can change it to black under Global Styles » Default Style » background color. Try to mess with it a bit, and when you're comfortable, let's continue with Part 2 of this guide!


2. Basic code structure

One Event/Statement Per Line

The following example shows how I used to write code when I just got started:

LDA #$E17D	;check if elevator bit is set
STA $099C	;code to run
LDA #DOOROUT	;label
STA $07B5	;doorout!!!
LDA !value	;would be value 1
ASL a		; 2 	;doorout pointer will be my custom code (!doorout) 2 + DOOROUT
ADC $07B5
TAX 
LDA $8F0000,x	;loads Door Definition code pointer
STA $078D	;stores the pointer here
LDA #$7777
STA !flag2
LDA !Istay
STA !nextinst
LDA #$000A	;09 = execute $099C
;STA $0998	;also initiate room transition
RTS

… and this is a good example of how not to write your code. You can see it takes up lots of lines and overall is not very nice to look at. So instead of using one opcode + argument per line, try to use one event/statement per line.

Now what's a statement?


A statement is a combination of opcodes and arguments that performs a check, performs an operation, or otherwise changes a value in places such as RAM addresses and hardware registers.


Most statements consist of three steps: reading » calculating/masking » writing/checking

An example statement would be an addition operation:

LDA $09C2 		;reading
CLC
ADC #$0001		;calculating
STA $09C2		;writing (simple, nuh?)

Now this can be organized to have the operation on a single line using colons as separators:

LDA $09C2 : CLC : ADC #$0001 : STA $09C2

“What, you can actually do that?!” Sure thing you can!

Unorganized check statement (also referred to as conditional branching):

LDA $09C2		;reading
AND #$FF00		;masking
CMP #$0E00		;\checking (I think you get the idea)
BEQ Branch		;/

Organized check:

LDA $09C2 : AND #$FF00 : CMP #$0E00 : BEQ Branch

Now what if there's a combination of the two like the one below?

LDA $09C2
CMP #$0E00
BEQ Branch
LDA $09C2 
CLC
ADC #$0001
STA $09C2
Branch:
LDA #$0E00
STA $09C2

Easy, just remember one statement per line:

	LDA $09C2 : CMP #$0E00 : BEQ Branch
	LDA $09C2 : CLC : ADC #$0001 : STA $09C2
Branch:	LDA #$0E00 : STA $09C2

Notice that the branch labels can be put on the same line right before the code without looking cluttered.
However, please avoid merging multiple statements into one line like this:

        LDA $09C2 : CMP #$0E00 : BEQ Branch : LDA $09C2 : CLC : ADC #$0001 : STA $09C2
Branch: LDA #$0E00 : STA $09C2

Other examples of statements/events:

  • Loading an immediate value followed by a jump to a subroutine that uses it (and optionally gives output based on it):
LDA #$0001 : JSL $8081DC : BCS Branch
  • Loops:
	LDX #$000A
back:	LDA $12 : CLC : ADC $07A5 : STA $12
	DEX : CPX #$0000 : BPL back

Three statements make up this loop: initializing X, calculating/changing an address value, and decrementing/checking X, thus I used three lines.
Now there are some special cases of statements, especially when it comes to checking several bits in a specific order, like button inputs or items:

LDA $09A2
BIT #$0020
BNE GRAV
BIT #$0001
BNE VAR
BRA OUT

For this there are multiple ways, one is using tabs/spaces for alignment:

LDA $09A2 :	BIT #$0020 : BNE GRAV
		BIT #$0001 : BNE VAR
BRA OUT

Another way would be to simply give LDA $09A2 its own line:

LDA $09A2
BIT #$0020 : BNE GRAV
BIT #$0001 : BNE VAR
BRA OUT

Simple return statements (RTS, RTL) should have their own line, except if used with carry operations:

CLC : RTS
SEC : RTL
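
To show why the carry operation and the return belong together, here is a minimal sketch of a routine that reports its result through the carry flag, plus a caller that branches on it. CheckVaria is a made-up label, and I'm assuming 16-bit A and the same $09A2 item bits used elsewhere in this guide:

	JSR CheckVaria : BCC +		;carry clear = varia not equipped, skip
	LDA #$0001 : STA $12		;only runs when the carry came back set
+	RTS

CheckVaria:
	LDA $09A2 : BIT #$0001 : BNE +
	CLC : RTS			;not equipped
+	SEC : RTS			;equipped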

Status register changes (SEP, REP) as well as unconditional jumps (BRA, JMP, JML) should each be on their own line:

SEP #$30
LDX #$0A : LDA #$04 : STA $0000,x
REP #$30
BRA OUT

Stack operations should be put together onto a single line, for easy recognition of the order they are in (wrong order of pushing onto/pulling from stack is the most common reason for emulator crashes!):

PHX : PHY : PHP
REP #$30
[code]
PLP : PLY : PLX	;always pull in the reverse order of the pushes
RTL
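
The same pattern works for the data bank register. This is only a sketch of a routine I made up for illustration, assuming it is called with JSL and should leave the caller's processor status and data bank untouched:

	PHP : PHB
	PHK : PLB			;set the data bank to the bank this code runs in
	REP #$30
	LDA $09C2 : CLC : ADC #$0001 : STA $09C2
	PLB : PLP			;PHB was pushed last, so PLB has to come first
	RTL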

Block movement organization:

PHB
LDA #$00FF : LDX #$8000 : LDY #$1000 : MVN $897E
PLB
RTS

Branches

There are three ways to define a branch in xkas that in the end do the same thing.
The first one requires you to know how many bytes an opcode's argument takes up (if it is using any) plus one (the opcode itself), also paying attention to 16/8bit mode:

REP #$20
LDA $09A2 : AND #$0001 : BNE $02
BRA $06
LDA #$0020 : TRB $09A2
RTS
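
Counting the bytes for the example above could look like the following sketch. The byte counts in the comments are my own annotation. The second half assumes the same check done in 8-bit mode, where (as far as I know) xkas takes the operand size from the number of digits you write, so the immediate values shrink to one byte and the BRA offset changes with them:

REP #$20
LDA $09A2 : AND #$0001 : BNE $02	;skips the 2 bytes of BRA $06
BRA $06					;skips LDA #$0020 (3 bytes) + TRB $09A2 (3 bytes)
LDA #$0020 : TRB $09A2
RTS

SEP #$20
LDA $09A2 : AND #$01 : BNE $02		;still skips the 2 bytes of the BRA
BRA $05					;LDA #$20 is only 2 bytes now, TRB $09A2 is still 3
LDA #$20 : TRB $09A2
REP #$20
RTS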

But as you can see, this is quite awkward and doesn't really make sense, since xkas calculates the pc offset for you when you use labels.
Also, actually seeing the branch destination is always better, right? So the second way is simply to use labels to name the branches:

	REP #$20
	LDA $09A2 : AND #$0001 : BNE VAR
	BRA OUT
VAR:	STZ $09A2
OUT:	RTS

Alright, up to this point, that should be common knowledge, even for ASM beginners. Now if you have a ton of code and are tired of naming all your branches, there's a solution!
Xkas has a nifty feature that lets you use sublabels (+/- signs) for branches, depending on whether the branching direction is positive or negative.

"Sublabels allow you to reuse redundantly named labels such as loop, end, etc. without causing duplicate label conflicts.
A new sublabel group is started immediately after a label is declared automatically.
A +/- label can be up to 3 levels deep, e.g. +, ++, +++, -, --, ---.
They overwrite their pc offsets immediately after being redefined.
Useful for very short loops, when even something like .loop would become redundant in a long routine."

This is a quote from the xkas help file (which you should have read! Seriously, if you didn't, just take a look; it's the .html file that's included in the xkas zip file!).
Now what does that mean exactly?

	LDA $12 : CMP #$000A : BEQ Go
	LDX #$0009
back:	LDA $12 : CLC : ADC #$0001 : STA $12
	DEX : CPX #$0000 : BPL back
Go:	RTS

Let's assume you're using the above kind of loop several times.
It still would require you to think of appropriate and unique label names all the time so you don't get things confused. That wouldn't be the only issue, though.
You'd have to write the same label for each branch if your routine checks for a lot of things and branches to “GetOut:” like 20 times.

This is the loop example, but this time using sublabels:

	LDA $12 : CMP #$000A : BEQ +
	LDX #$0009
-	LDA $12 : CLC : ADC #$0001 : STA $12
	DEX : CPX #$0000 : BPL -
+	RTS

Okay, now I'm perfectly aware that using them in a single loop is kinda lame and doesn't convince you, but what would you say if you could use + and - over and over again? Like this:

	LDA $09C2 : CMP #$0010 : BPL +		;this branch...
	LDX #$0010						
-	LDA $09C2 : CLC : ADC #$0002 : STA $09C2
	DEX : CPX #$0000 : BPL -
+	LDA $09C6 : CMP #$000A : BEQ +		;...goes to here
	LDX #$000A
-	LDA $09C6 : CLC : ADC #$0002 : STA $09C6
	DEX : CPX #$0000 : BPL -
+	LDA $09CA : CMP #$1FFF : BEQ +		
	LDX #$1FFF
-	LDA $09CA : CLC : ADC #$0005 : STA $09CA	;whereas here is where...
	DEX : CPX #$0000 : BPL -			;...this branch goes to
+	RTS

Awesome, isn't it? Okay, this criss-crossing might look confusing at first, but try to understand where each of the branches goes.
Now what does “up to 3 levels deep” mean? It means that there can be multiple branch levels, so you have to use just as many levels for the sublabels:

	REP #$20
	LDA $09A2 : AND #$0001 : BNE +			;branch 1, level 1
	BRA ++						;branch 2 (located *before* destination 1 , thus level 2)
+	STZ $09A2					;destination 1
++	RTS						;destination 2

Note that branch 2 comes before destination 1, meaning that in order to actually assign branch 2 to destination 2, you have to use ++ (level 2). This is regardless of whether destination 2 is located before or after destination 1!
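
A third level works the same way. The following is only a sketch of how I'd expect +++ to be used, assuming 16-bit A and with $12 standing in as a scratch address for the result:

	LDA $09A2 : BIT #$0020 : BNE +		;branch 1, level 1
	BIT #$0001 : BNE ++			;branch 2, level 2
	LDA #$0000 : BRA +++			;branch 3, level 3
+	LDA #$0002 : BRA +++			;destination 1, also branches to the next +++
++	LDA #$0001				;destination 2
+++	STA $12					;destination 3
	RTS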

The same goes for - signs, which branch backwards in the pc count:

	LDY #$000A
--	LDX #$0005							;destination 2
-	LDA $1000,x : CLC : ADC $8000,y : STA $1000,x	;destination 1
	DEX : DEX : CPX #$0000 : BPL -			;branch 1
	DEY : DEY : CPY #$0000 : BPL --			;branch 2

Please note that you can't mix the use of labels and sublabels, like this:

	LDY #$000A
-	LDX #$0005							;destination 2
back:	LDA $1000,x : CLC : ADC $8000,y : STA $1000,x	;destination 1
	DEX : DEX : CPX #$0000 : BPL back			;branch 1
	DEY : DEY : CPY #$0000 : BPL -			;branch 2

xkas will give an error saying the negative branch is too long, as it isn't able to calculate the location for - due to interference with the normal label. This is presumably because, as the help file quote above says, a new sublabel group is started after a label is declared, so the - defined before back: is no longer found. (At least as of v0.06; I don't know if this has been fixed already.)


Folding Blocks Of Code

You can use the brackets/braces { } to collapse code by clicking the [-] box on the left. Xkas will ignore them, so you're free to use them anywhere.

Example:

JSR $A4D6
{ ;this line will be the only one left visible if collapsed
	ASL A : TAX
	LDY $A4EB, X : BNE BRANCH
	SEC : RTS
BRANCH:	LDX #$00E2 : LDA #$000E : JSL $A9D2E4
	CLC : RTS
}

In N++, it'll look like this: [screenshots of the example expanded and collapsed]

Not really much to say, except that you should have as many open brackets as closed ones, else N++ won't collapse them correctly. By default, if you open an ASM file in N++, everything is unfolded. To fold/collapse all the text blocks, go to View > Fold All.

Commentary

Commentary is very important for getting and keeping a good understanding of what's going on. But it's easy to cross the line and soak your code in unnecessarily detailed comments.
That helps about as much as bare-bones code without a single word of explanation, because no one's going to read any of it. However, the concept of one event/statement per line that I showed you will actually prevent you from doing so.

This code has too much info, which is only useful for absolute ASM beginners:

LDA $09A2	;Load item address
AND #$0001	;mask varia bit
CMP #$0001	;check if it is set
BEQ VAR	;branch if varia is equipped

Using the one-statement-per-line rule:

LDA $09A2 : AND #$0001 : CMP #$0001 : BEQ VAR	;check if varia is equipped

You'll notice that it doesn't need that much info; a simple comment about the event/statement itself is totally sufficient, and still better than none.
If you don't know how specific opcodes work exactly, feel free to ask, that's what Metconst is for after all!
Again, you don't HAVE to use comments, only where you feel things need a short explanation.
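
As a rough sketch of what that looks like over a few lines, here is one short comment per statement. The routine itself is made up, and I'm assuming 16-bit A and that $09C2 holds Samus' current energy:

	LDA $09A2 : AND #$0001 : BEQ +			;skip if varia is not equipped
	LDA $09C2 : CLC : ADC #$0001 : STA $09C2	;give 1 energy
+	RTS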

3. Organizing Xkas Commands

In the ASM stylesheet, xkas commands are blue, operators are teal (or red in previous versions).

Org and Data Commands

The most commonly used commands are DB (Data Byte), DW (Data Word), DL (Data Long) and org (sets the file position). org, along with an optional label, should always be put on a single line and, if necessary, follow the open brace on the same line so that the code after it can be collapsed.

Example:

{ org $808000	;again, this line will stay visible even when collapsed, use this as a "bookmark" to easily find certain chunks of code.
DW $0000 : DW $0001 : DL $7E8000 : DB $FF	
}

Example using labels:

{ org $BF8000 : TABLE:		;note the colon between org and label

DW $0001 : DW $0002 : DW $0004 : DW $0008
DW $0010 : DW $0020 : DW $0040 : DW $0080
DW $0100 : DW $0200 : DW $0400 : DW $0800
DW $1000 : DW $2000 : DW $4000 : DW $8000
}
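
Reading from such a table could be sketched like this. GetBit is a made-up routine placed right behind TABLE, and I'm assuming 16-bit A/X, a bit number from 0 to 15 in A, and that your xkas version accepts the .l suffix to force long addressing:

{ org $BF8020 : GetBit:		;TABLE is 16 words = $20 bytes, so this starts right behind it
	ASL A : TAX		;word entries, so index = bit number * 2
	LDA.l TABLE,x		;long addressing, so the current data bank doesn't matter
	RTL
}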

Label Definitions

Defines are awesome and very helpful, though only if organized properly. This is how I used to write label definitions:

!input = $8B
!frameinput = $8F
!up = $09AA
!down = $09AC
!left = $09AE
!right = $09B0
!speed = #$0006
!jump = $09B4
!timer = $072D
!EnemySizeX = #$0020
!EnemySizeY = #$0010
!SMILEspeed = $0FB4
!SMILEspeed2 = $0FB6
!bomby = $0B82,x
!samusy = $0AFA
!pby = $0CE4
!projectiley = $0B78,x

Having them like this is especially bad if you have a large file with tons of asm defines, so try to categorize and visually separate them:

{ ; LABEL DEFINES ======================
	{	; BUTTON INPUT -----------------
	!input = $8B : !frameinput = $8F
	!up = $09AA : !down = $09AC
	!left = $09AE : !right = $09B0
	!jump = $09B4
	}
	
	{	;IMMEDIATE VALUES---------------
	!speed = #$0006 : !region = #$0005	
	!EnemySizeX = #$0020 : !EnemySizeY = #$0010
	}
	
	{	;SMILE ENEMY EDITOR ------------
	!SMILEspeed = $0FB4 : !SMILEspeed2 = $0FB6
	}
	
	{	;Y POSITIONS--------------------
	!samusy = $0AFA : !bomby = $0B82,x
	!projectiley = $0B78,x : !pby = $0CE4
	}
	
	{	;MISC --------------------------
	!timer = $072D
	}
}
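
Once the defines are set up, the code that uses them reads almost like plain text. A small sketch using the defines from above (what the code does is made up, and 16-bit A is assumed):

	LDA !samusy : CLC : ADC !speed : STA !samusy	;same as LDA $0AFA : CLC : ADC #$0006 : STA $0AFA
	LDA !pby : CMP !samusy : BPL +			;same as LDA $0CE4 : CMP $0AFA : BPL +
	STZ !timer					;same as STZ $072D
+	RTS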

In N++: [screenshot of the defines above]

Repetitions

Repeatedly used opcodes:

Instead of writing a bunch of NOPs and ASLs all over the place like this:

NOP : NOP : NOP
NOP : NOP 
NOP
NOP
LDA $07A5 : AND #$00FF : ASL : ASL : ASL : ASL : STA $0AF6

…you can let xkas create pseudo opcodes using # and numbers (always decimal):

NOP #7						;this equals NOP : NOP : NOP : NOP : NOP : NOP : NOP
LDA $07A5 : AND #$00FF : ASL #4 : STA $0AF6

But be careful not to forget the '$' sign on status register changes (SEP, REP) and 8-bit operations:

SEP #$20
LDA #$10 : STA $7ECD20
REP #20			;forgetting '$' will instead write REP #$00 20 times, meaning a total of 40 bytes is written!

Xkas fill function organization:

org $808000 : padbyte $FF : pad $808010		;should always be on one line
fillbyte $ff : fill 16
super/expert_guides/asm_stylesheet.txt · Last modified: 2019/10/01 08:04 by black_falcon