Reverse Engineering a Gameboy Advance Game: Extracting the Objects from the Level — Part 6
This post is part of a series entitled Reverse Engineering a Gameboy Advance Game. Read the introduction here. Read the previous post here.
Follow me on Twitter for more computer fun 🐦
In our last post, we extracted the level tilemap. However, a level has more than just its tilemap. Another important feature is its objects, such as diamonds and monsters. As was already explained in the third chapter, the objects are mapped in a section of memory called the OAM (Object Attribute Memory).
So let’s include object plotting in our POC, to make it more complete than just plotting the tilemap!
We can start by using No$GBA’s Vram Viewer panel again to inspect the OAM and find the memory address where a monster is stored.
And as we can observe, this monster is at address 07000008. That’s expected: we’re looking at only two objects (Klonoa and the monster), Klonoa is always the first object, and each OAM entry takes 8 bytes.
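The address arithmetic can be checked with a tiny sketch. On the GBA, OAM begins at 07000000 and entries are 8 bytes apart; the helper name `oamEntryAddress` is made up for this illustration.

```javascript
// OAM starts at 0x07000000 on the GBA and each entry is 8 bytes wide.
const OAM_BASE = 0x07000000;
const OAM_ENTRY_SIZE = 8;

// Hypothetical helper: address of the n-th OAM entry (0-based).
function oamEntryAddress(index) {
  return OAM_BASE + index * OAM_ENTRY_SIZE;
}

// Klonoa is entry 0, so the monster in entry 1 sits 8 bytes later.
console.log(oamEntryAddress(1).toString(16).padStart(8, '0')); // 07000008
```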
Okay. We’re already in the sixth chapter, so we’ve come a long way in our glorious reverse engineering.
Do you remember the third chapter? Okay. In it we discovered that the OAM is updated via DMA, using several bytes starting at address 03004800 as a source. Also, we discovered that the objects, during an update period of each frame, are overwritten by garbage (instruction 08005D00) and in the following instructions the objects are written again, but with updated values.
So let’s reuse our accumulated knowledge and go directly into step-by-step debugging where the objects are updated; we’ll search for the instruction which writes the position of the monster to discover where the source of information about all of the level objects is located, and what is the logic behind it.
As we said in the previous paragraph, updating the objects is done in the instructions following 08005D00. And we need to look out for what effectively updates the position of that monster in the source of the data for OAM, that is, updates the address 03004808 of Fast WRAM (03004800 + 8).
More precisely, since this monster is walking horizontally, and the bytes holding its X position are the third and fourth of its entry, we’ll look for when the values of bytes 0300480A:0300480B are updated.
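To see why the third and fourth bytes matter, here is a minimal sketch of decoding a position from one 8-byte OAM entry. It relies on the standard GBA layout (Y in the low 8 bits of attribute 0, X in the low 9 bits of attribute 1); the function name and the sample bytes are invented for illustration and are not taken from the game.

```javascript
// Decode the screen position from one 8-byte OAM entry (little-endian).
// Attribute 0 (bytes 0-1): Y in bits 0-7. Attribute 1 (bytes 2-3): X in bits 0-8.
function decodeOamPosition(entry) {
  const view = new DataView(entry.buffer, entry.byteOffset, 8);
  const attr0 = view.getUint16(0, true);
  const attr1 = view.getUint16(2, true);
  return { x: attr1 & 0x1ff, y: attr0 & 0xff };
}

// Made-up entry for illustration: Y = 0x50, X = 0x078.
const sample = new Uint8Array([0x50, 0x00, 0x78, 0x00, 0x00, 0x00, 0x00, 0x00]);
console.log(decodeOamPosition(sample)); // { x: 120, y: 80 }
```

The masks matter because the remaining bits of both attributes carry other settings (shape, size, flipping and so on), so the raw 16-bit values are not the coordinates themselves.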
Stepping through the debugger, we can observe that the new position of the monster is written by the instruction in 08006C1E. Reading the assembly and analyzing the flow of information, we realize that it calculates the byte to be written using a constant value as its base: 03000820. Besides that, the constant 03002920 is also used.
In the reverse engineering process, whenever you encounter a constant value, thank the gods, since that’s one of the few things you can count on to guide you. And these values appear interesting. So let’s see what’s in these constants, starting with 03002920!
And just look… we already talked about 03002920 in the third chapter… It’s where Klonoa’s absolute X position relative to the map is stored… besides that, we discovered just now that the position of our monster is stored not far after it… this really is something that draws a lot of attention, no?
And that’s not a fluke! Looking closely at the behavior of these bytes, we can see that we really are in a section of the game’s memory which stores the state of all objects in the whole level, including what is outside of the player’s view!
We can confirm this by running the game and watching the values change in a consistent pattern. We can also grab a monster and see a few of its bytes change as Klonoa walks, which demonstrates a clear correlation. And that is really amazing!
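This kind of observation can be made less manual by dumping the memory region twice and comparing the dumps. The sketch below assumes you already have two snapshots as byte arrays (how you export them from the emulator is up to you); `changedOffsets` is a hypothetical helper, not part of any tool mentioned here.

```javascript
// Compare two memory snapshots of equal length and list the offsets whose bytes differ.
function changedOffsets(before, after) {
  if (before.length !== after.length) {
    throw new Error('snapshots must have the same length');
  }
  const offsets = [];
  for (let i = 0; i < before.length; i++) {
    if (before[i] !== after[i]) offsets.push(i);
  }
  return offsets;
}

// Made-up snapshots: only the byte at offset 2 changed between them.
const before = Uint8Array.of(0x10, 0x00, 0x40, 0x01);
const after = Uint8Array.of(0x10, 0x00, 0x48, 0x01);
console.log(changedOffsets(before, after)); // [ 2 ]
```

Taking one snapshot, walking a few steps with the monster in hand, and taking another narrows the candidates down to the bytes that track the movement.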
We can think of this memory region as a kind of Global OAM designed by the people who developed the game. And after some time playing with it, changing values and seeing how they behaved, we managed to understand better how it works. For example, I noticed that each object occupies about 28 bytes and, curiously, the Global OAM also stores other information which I didn’t expect, like the objects related to the animation when Klonoa fires a wind bullet to grab a monster.
And I believe that, on each frame, the game computes the intersection of the Global OAM with the player’s view to decide which objects should be included in the OAM, and therefore whether or not they should be shown on the screen. This explains why, when the OAM is updated, everything is first replaced with garbage and then overwritten with the already-updated values: it’s easier to delete everything and afterwards write what needs to be kept.
That is, based on this new information, what we previously thought was only garbage placed by the instruction 08005D00 during the update period of OAM, is actually just the default value for each slot of OAM to say, “there’s nothing here”.
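As a rough illustration of that hypothesis, here is a sketch of a "wipe, then refill" update. This is not the game's code: the object fields, the camera shape, and the use of `null` as the "nothing here" value are all assumptions, and the visibility test ignores sprite sizes.

```javascript
// Sketch of the hypothesized per-frame rebuild: reset every slot, then refill visible ones.
const OAM_SLOTS = 128;   // the GBA has 128 OAM entries
const EMPTY_SLOT = null; // stand-in for the game's real "nothing here" value

function rebuildOam(globalObjects, camera) {
  const oam = new Array(OAM_SLOTS).fill(EMPTY_SLOT);
  let next = 0;
  for (const obj of globalObjects) {
    const screenX = obj.x - camera.x;
    const screenY = obj.y - camera.y;
    const visible =
      screenX >= 0 && screenX < camera.width &&
      screenY >= 0 && screenY < camera.height;
    if (visible && next < OAM_SLOTS) {
      oam[next++] = { x: screenX, y: screenY, kind: obj.kind };
    }
  }
  return oam;
}

// Made-up data: the first object is inside a 240x160 view, the second is far off-screen.
const camera = { x: 100, y: 0, width: 240, height: 160 };
const oam = rebuildOam([{ x: 150, y: 50, kind: 1 }, { x: 900, y: 50, kind: 2 }], camera);
console.log(oam[0], oam[1]); // { x: 50, y: 50, kind: 1 } null
```

Starting from a fully reset table means the code never has to track which slots were in use on the previous frame.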
Nice, now that we’ve discovered this region of memory, which we’ve decided to name the “Global OAM”, we’ll find exactly where the position of the monster is written at the moment the level is loaded.
It’s very easy to find a monster in Global OAM. Just have Klonoa grab it and walk around. Then the bytes of the monster’s X position will be updated according to the character’s walking. And this way we found the respective bytes which we need to track: 03002AA8.
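As a quick sanity check on the "about 28 bytes" figure, we can compare this address with Klonoa's X position at 03002920. The sketch assumes the entries are contiguous and exactly 28 bytes wide, which the observations so far suggest but don't prove.

```javascript
const KLONOA_X_ADDR = 0x03002920;  // Klonoa's X position (from the third chapter)
const MONSTER_X_ADDR = 0x03002aa8; // the monster's X position found above
const ENTRY_SIZE = 28;             // assumed exact; observed as "about 28 bytes"

const distance = MONSTER_X_ADDR - KLONOA_X_ADDR;
console.log(distance);              // 392
console.log(distance / ENTRY_SIZE); // 14
console.log(distance % ENTRY_SIZE); // 0
```

The distance divides evenly, which is consistent with both X positions living at the same offset inside entries that are 14 slots apart.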
So, we’ll track when the monster is written to this byte for the first time, in order to discover where the initial data for the Global OAM is located in the ROM!
Just like we said a few chapters ago, the level decompression is done in the instruction 08051440. And we can note that before it is called, the Global OAM hasn’t been written yet.
Debugging step by step, we realize that the Global OAM is written in steps. First Klonoa’s object is written, afterwards some other objects… And only after the function in 08053CD4 is called are the monsters’ objects written. Stepping into this function, we come across another function relevant to writing the monsters, which is called in 0800B602… Stepping in that, we see that the position of the monster is written in the function called in 0800B634… and finally we find which instruction writes the position: it’s in 0804505C, which writes the value of R0 to the address pointed to by R3, which in the case of our monster will obviously be 03002AA8.
Okay. Looking for where it got that value, we notice that it uses the value of R5 as a base, which previously received the constant value 080E2B64. (Hey… did this paragraph end up being confusing? Yeah, it did. But relax, we just dove from function to function in the assembly, and we analyzed the passing of data from one corner to another, and what’s important is that we found an important address: 080E2B64! Be happy with it!)
And going to this address… Look here, this is where the initial data for the level objects is located!
It’s possible to see this just by noticing similarities with the values of the Global OAM and seeing the changes propagated by the game when changing a few bytes.
We could call this the “skeleton layout” of the level OAM.
For example, here we manage to change the spawn position of a monster and the type of each object.
Something curious is that the ID of the object’s type isn’t related to its sprite. Thus we can do some very funny things, as the GIFs below illustrate.
From now on I will call this new memory region ROM OAM, to differentiate from the Global OAM (OAM in WRAM) and from OAM (provided by the GBA itself).
The information flow is ROM OAM → Global OAM → OAM.
Another thing I noticed is that each object in the ROM OAM occupies much more space than in the Global OAM: about 44 bytes. The reason for this is that the Global OAM isn’t really “Global”.
To explain: each level in the game Klonoa is divided into phases, and the Global OAM only holds the information for objects present in the current phase. This makes sense, since it saves memory and processing by reducing the amount of data to be computed on each frame. So, whenever Klonoa moves to another phase in the level, the same function which populates the Global OAM when entering a level is called again, in order to reset the Global OAM and leave it with only the objects present in that phase. Of course, it uses the ROM OAM as its source, which stores the data for objects in all phases of the level.
To summarize this flow: the game’s ROM contains a list of initial object data for each level (“ROM OAM”). The game reserves a section of WRAM (“Global OAM”) to hold the current state of each object in the current phase of the current level. When the game loads a phase, it reads the ROM OAM to initialize the objects in Global OAM; Global OAM changes as each object moves around or changes state. Finally, if an object should be visible on the screen, the game loads its sprite data into OAM. The GBA’s video processor reads OAM and draws the requested sprites on the screen.
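The first hop of that flow (ROM OAM to Global OAM) can be sketched as below. The record shapes and the `loadPhase` name are hypothetical, and the sketch makes the simplifying assumption that every ROM OAM record carries a position for every phase.

```javascript
// Hypothetical shapes: the real records hold more fields than shown here.
// Simplifying assumption: every ROM OAM record has a position for every phase.
function loadPhase(romOam, phase) {
  // ROM OAM -> Global OAM: copy the chosen phase's spawn point into live state.
  return romOam.map((record) => ({
    x: record.positions[phase].x,
    y: record.positions[phase].y,
    kind: record.kind,
  }));
}

const romOam = [
  { kind: 3, positions: [{ x: 10, y: 4 }, { x: 55, y: 9 }] },
];
const globalOam = loadPhase(romOam, 1);
globalOam[0].x += 1; // the live copy changes while the ROM data stays put
console.log(globalOam[0].x, romOam[0].positions[1].x); // 56 55
```

Keeping the ROM data read-only and copying it into a working table is what lets the game reset a phase simply by loading it again.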
What data does the ROM OAM store? The initial position of each object in each phase, plus the type of that object. It also stores some other information, for example: if the monster is a Moo, whether it should start walking to the left or to the right; if the monster is a Flying Moo, how long it can fly.
How did I discover that? Again, just by altering the bytes and seeing how it behaves in the game! It’s very fun to do that, by the way.
The diagram below illustrates what I managed to map out.
Okay. We found the ROM OAM for the first level! And where will it be for the second level? Could it be further down? That’s a very easy hunch to test!
And when I went down, after many sequences of 00’s, I found a section with several bytes very similar to the ones before! And changing some of these bytes… yeah! It really is the ROM OAM for the second level! And doing that again, I found the one for the third level!!
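The "scroll down past the zeros" search can also be done programmatically. This is a minimal sketch, assuming the ROM is loaded as a `Uint8Array`; the function name and the 16-byte threshold for what counts as a long run of zeros are arbitrary choices, not values taken from the game.

```javascript
// Starting at `start`, skip the current data, then skip a run of at least
// `minZeros` zero bytes, and return the offset where non-zero data resumes.
// Returns -1 if no such offset exists.
function findNextBlock(rom, start, minZeros = 16) {
  let i = start;
  while (i < rom.length) {
    if (rom[i] !== 0) { i++; continue; }
    const runStart = i;
    while (i < rom.length && rom[i] === 0) i++;
    if (i - runStart >= minZeros && i < rom.length) return i;
  }
  return -1;
}

// Made-up "ROM": data at 0-3, sixteen zero bytes, then more data at offset 20.
const rom = new Uint8Array(40);
rom.set([1, 2, 0, 3], 0);
rom.set([9, 9], 20);
console.log(findNextBlock(rom, 0)); // 20
```

A candidate offset found this way still needs the manual check described above: change a few bytes and see whether the next level reacts.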
Cool, now that we understand how this is architected in the ROM, let’s draw them in the level! For that, we need to modify our JS code to be able to read this byte structure.
Notice that reading the OAM will be very different from reading the tilemap. While for the tilemap we only needed to go through a vector and start painting the pixels, in the case of OAM we need to map different bytes for an object in JS. To facilitate this new task, we should look for a JS package in NPM. At first I thought of using a library called qunpack, since it seemed to me to be easier, as it’s very similar to Perl’s unpack, with which I was familiar.
const [
  xFirstPosition, yFirstPosition, xSecondPosition, ySecondPosition,
  xThirdPosition, yThirdPosition, xFourthPosition, yFourthPosition,
  xFifthPosition, yFifthPosition, kind,
] = qunpack.unpack('v2 x4 v2 x4 v2 x4 v2 x4 v2 x5 c', bytes);
However, this code is kind of weird for anyone who hasn’t memorized what each of these letters followed by a number means: v2 x4 v2 x4 v2 x4 v2 x4 v2 x5 c! Happily we’re not programming in Perl, so let’s look for something more semantic in the JS universe!
Investigating further, I found this fantastic library by substack: binary. With this, we managed to have much more semantic JS code:
const binary = require('binary');

// Each name passed to a word* call becomes a key in `vars`;
// skip(n) jumps over bytes we haven't mapped.
const {
  xPhase1, yPhase1, xPhase2, yPhase2,
  xPhase3, yPhase3, xPhase4, yPhase4,
  xPhase5, yPhase5, kind,
} = binary.parse(memory)
  .word16lu('xPhase1')
  .word16lu('yPhase1')
  .skip(4)
  .word16lu('xPhase2')
  .word16lu('yPhase2')
  .skip(4)
  .word16lu('xPhase3')
  .word16lu('yPhase3')
  .skip(4)
  .word16lu('xPhase4')
  .word16lu('yPhase4')
  .skip(4)
  .word16lu('xPhase5')
  .word16lu('yPhase5')
  .skip(5)
  .word8lu('kind')
  .vars;
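For comparison, the same layout can be read without any dependency using the built-in `DataView`. This is only an alternative sketch, not what the POC uses; the offsets are derived from the chained parser above, while the constant names and the `parseRomObject` helper are invented here.

```javascript
// Field offsets implied by the chained parser above: five (x, y) pairs, each
// followed by 4 skipped bytes (5 after the last pair), then a one-byte kind.
const PHASE_COUNT = 5;
const PHASE_STRIDE = 8; // 2 bytes x + 2 bytes y + 4 skipped bytes
const KIND_OFFSET = 41; // 5 * 8 + 1 extra skipped byte

function parseRomObject(bytes, offset = 0) {
  const view = new DataView(bytes.buffer, bytes.byteOffset + offset);
  const positions = [];
  for (let phase = 0; phase < PHASE_COUNT; phase++) {
    positions.push({
      x: view.getUint16(phase * PHASE_STRIDE, true), // true = little-endian
      y: view.getUint16(phase * PHASE_STRIDE + 2, true),
    });
  }
  return { positions, kind: view.getUint8(KIND_OFFSET) };
}

// Made-up record: phase 1 position (0x1234, 0x5678) and kind 7.
const record = new Uint8Array(42);
record.set([0x34, 0x12, 0x78, 0x56], 0);
record[41] = 7;
const parsed = parseRomObject(record);
console.log(parsed.positions[0], parsed.kind); // { x: 4660, y: 22136 } 7
```

Returning the positions as an array indexed by phase trades the named fields for a loop, which is handy when drawing one phase at a time.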
Nice, and having it draw after checking if a pixel has an object will wo… oops! Hmm… It didn’t work… There’s still something missing… Oh! Check it out: the scale of the objects isn’t the same as that of the tiles! They’re in a scale 8x smaller, which could be useful to have finer adjustment of exactly where each object should be. Therefore, let’s multiply the x and y by 8…
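The scale fix itself is a one-liner. Here is a sketch that follows the description above; the helper name is hypothetical and the sample coordinates are made up.

```javascript
const OBJECT_SCALE = 8; // factor described above: object coordinates are multiplied by 8

// Convert an object's stored coordinates into the coordinates used for drawing.
function toDrawPosition(obj) {
  return { x: obj.x * OBJECT_SCALE, y: obj.y * OBJECT_SCALE };
}

console.log(toDrawPosition({ x: 12, y: 5 })); // { x: 96, y: 40 }
```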
Yeah! It worked!! Now, besides drawing our tilemap, we’re drawing the level objects! Now our POC is worthy!
You can view the source code in my gist.
As you saw, extracting the OAM was extremely simple, once we gained the knowhow from our experience extracting the tilemap. Besides that, we were lucky that the OAM data for each level isn’t compressed, which made it much easier when we read the data.
This same knowhow for how to extract the tilemap and the objects can be abstracted and used to obtain other level data, such as the portals between the phases of the level, as well as Klonoa’s initial position in each level. I won’t go into those in these posts, since it would be very repetitive, and we have other, more fun things to discuss.
After all, in the next post we’ll start talking about something very different! Now that we have a more complete POC, we’ll start to code our real project, beginning with drawing the tilemap in the browser! Starting with the next post we’ll talk less about how to do reverse engineering and more about how to apply what we’ve obtained to be useful in our fun personal project, a map editor webapp for Klonoa’s Gameboy Advance game!