pspemu (V)
March 26, 2010
A few days ago my PSP emulator had 77 Subversion commits. Today it has 99.
Now there are a few graphical demos working. A few more than in revision 77, but of course still fewer than in the version I wrote a few years ago. It also doesn't have a graphical debugger.
But what I can guarantee is that it is much better in terms of code (even if it requires some refactoring, cleanup and documentation). Furthermore, in this time I have learnt a lot about D and about programming patterns, so I was able to solve typical problems better.
After this version I'm a bit calmer. I have spent a few days working super hard on it. I have done things that would have taken a few people a few weeks. Maybe it is due to spring, which makes people more active, but at any rate being motivated is something you should take advantage of, and I needed to disconnect from everything for a while.
I will continue with the emulator, but at a slower pace. At least on workdays :P
Regarding the things I said I was going to do, I have done almost everything.
The most interesting thing I have done (and the newest compared to last time) has been to implement threads and semaphores.
A few things:
- I have implemented input. Interactive demos that rely on button presses now work. This time I implemented button input in a way that I believe is closer to the real hardware. Last time it just saved the latest state. Now, from time to time, I produce what is called a "frame" and store it in a ring buffer. The API allows reading the latest input frame or a set of the latest frames available.
- Did a few experiments with SIMD (SSE). Initially I thought that vertex processing and matrices, especially in the case of GU_SPRITES, should be done on the CPU. So I created a few structures/classes: Matrix and TVector(Type, int Size = 4) -> Vector!(float, 4). On one hand I wanted to try generating structures via D templates, and on the other I wanted to be able to create specialized classes/structures. The Vector!(float, 4) specialization makes the vector operations execute using SSE extensions. I implemented external and internal operations. External: opAdd/opSub/opMul/opDiv(float). Internal: opAdd/opSub/opMul/opDiv(Vector). I also implemented Matrix by Vector multiplication in order to do transformations. But in the end it seems that this implementation may not be necessary, since I'm going to use Geometry Shaders to create the two remaining vertices (tessellation). This is something relatively new that I haven't tried yet, and this is a great opportunity to try it, in addition to getting a huge performance boost. At any rate, having Vector and Matrix classes implemented is useful. It let me run some experiments in case I ever write a software-based implementation of the GE. In fact I already have a prototype.
- Started audio support. Right now it works partially: it generates the audio wave, but doesn't play it right. There are a lot of pauses and it sounds pretty bad, but at any rate it is a good initial approach. At the beginning it didn't work at all, even though I had implemented threads, and I didn't know why. Threads were being executed, but the wave wasn't generated. Then I saw that there was a problem with the implementation. It turns out that sceKernelStartThread switches to the newly created thread immediately. Relying on that behaviour, the audio thread received an argument pointing to a volatile parameter that the caller thread changed right afterwards. That parameter lived on the stack of the parent thread, and it was even being changed inside a for loop. Since a bunch of instructions get executed before another thread switch happens, the first thing the new thread does is copy the passed parameter to a local variable. My implementation was not executing the thread immediately but placing it in the thread queue to be run later by the ThreadManager, so by the time the thread ran, the variable had already changed its value. In this case all the created threads ended up processing channel 4, instead of 0, 1, 2, 3 respectively :P For curious people: pspsdk\src\audio\pspaudiolib.c
- Implemented interrupts. I changed how the vblank handling worked, which was a bit dirty. Now it happens in a registered event triggered by the VBLANK interrupt. That stabilized the fps a lot. Before that, some frames were doing weird stuff.
- Regarding the OpenGL implementation: I have now implemented texture swizzling and a few more opcodes, for example the […] opcode. I'm not paying too much attention to the GPU just yet. The important parts are the CPU, which now works more or less fine, and the kernel functions, especially the ones using threads and modules, which are the ones breaking most of the stuff.
- I added a […]. Until now HLE modules were loaded statically, which on one hand prevented them from being tested well and on the other prevented loading new binaries at runtime, or having several executions.
- Embedded the […] inside the executable and made it show information about modules and unimplemented NIDs.
- Implemented a pretty important part of the file handling module, using as a base the […] I created to map physical folders, which allows creating proxy entries, remapping other things, or in general creating virtual file systems. It is something I already did in the last version, but now I have implemented it much better. (I still haven't implemented the ISO backend for the VFS.) From that moment on, some programs that use files started to work (including SDL). However, I implemented this before threads, and SDL demos were crashing like hell because the SDL_Init function was not even finishing.
- Created a utility to detect infinite loops in functions that were working for other components. This kind of function should be avoided in favor of callbacks, but since it was easier and complexity has to be increased little by little, for now it does its job. Especially in components that run in different threads, which are THE PAIN.
- I added a debug dump key ([…]) that, in addition to showing registers and dumping the instructions near the current PC, shows the threads and active semaphores.
- Changed the executable generation system. Before, you had to use […] files. Now I use PHP. The PHP script also detects dependencies and compiles the necessary files without having to specify them manually. This kind of thing is done by other utilities like BUD/REBUILD and DSSS. In the end I will probably use DSSS, but for now I wanted to have more fine-grained control and an alternative option.
- I made a lot of refactorings and simplifications to the codebase, but I still have to improve a lot in this regard. There are things that are screaming for refactoring and cleanup.
- I fixed a lot of things, added a window menu, an icon and a way to hot load new programs.
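The thread start problem from the audio item above is easy to reproduce in a tiny model. What follows is only a sketch in Python, not the emulator's D code: generators stand in for PSP threads, a one-element list stands in for the variable on the parent's stack, and the names `audio_worker` and `spawn_audio_threads` are made up for the illustration. It assumes, as described above, that starting a thread either runs it right away until it first blocks, or merely queues it.

```python
from collections import deque


def audio_worker(arg_cell, seen):
    """Hypothetical audio thread body, modelled as a generator."""
    channel = arg_cell[0]   # first thing done: copy the argument to a local
    seen.append(channel)    # record which channel this thread will serve
    yield                   # the thread blocks here (e.g. waiting for output)


def spawn_audio_threads(start_immediately):
    """Create 4 workers the way the parent's for loop does.

    arg_cell stands in for the loop variable living on the parent's stack;
    every worker receives a reference to it, not a copy.
    """
    ready = deque()
    seen = []
    arg_cell = [0]
    while arg_cell[0] < 4:
        thread = audio_worker(arg_cell, seen)
        if start_immediately:
            next(thread)        # switch to the new thread until it blocks
        ready.append(thread)    # then (or instead) leave it in the queue
        arg_cell[0] += 1        # the parent keeps changing the variable
    # The parent finally gives up the CPU and the queued threads get a turn.
    while ready:
        thread = ready.popleft()
        try:
            next(thread)
        except StopIteration:
            pass                # worker already ran to completion
    return seen


print(spawn_audio_threads(start_immediately=True))   # [0, 1, 2, 3]
print(spawn_audio_threads(start_immediately=False))  # [4, 4, 4, 4]
```

With the immediate switch each worker copies its own channel number before the parent touches the variable again. With deferred start every worker reads the value left after the loop has finished, which matches the "all threads on channel 4" symptom.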
Some screenshots of the latest versions:
Textured cube from one of the NeHe tutorials ported to PSP. It uses texture swizzling, and the cube is rotated manually. It reports between 120 and 200 FPS.
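For readers who haven't met texture swizzling: a swizzled texture stores its pixels in small rectangular blocks instead of full rows, so an emulator has to convert the data back to linear rows before handing it to OpenGL. The sketch below is Python and not the emulator's D code. It assumes the commonly described PSP layout of blocks 16 bytes wide and 8 rows tall, stored one after another, and the function names are hypothetical.

```python
BLOCK_W = 16  # bytes per block row (assumed PSP layout)
BLOCK_H = 8   # rows per block (assumed PSP layout)


def _block_offsets(width_bytes, height):
    """Yield (linear_offset, swizzled_offset) for every 16-byte block row."""
    if width_bytes % BLOCK_W or height % BLOCK_H:
        raise ValueError("dimensions must be multiples of the block size")
    swizzled = 0
    for by in range(height // BLOCK_H):
        for bx in range(width_bytes // BLOCK_W):
            for row in range(BLOCK_H):
                linear = (by * BLOCK_H + row) * width_bytes + bx * BLOCK_W
                yield linear, swizzled
                swizzled += BLOCK_W


def swizzle(linear_data, width_bytes, height):
    """Reorder row-major texture bytes into block order."""
    out = bytearray(len(linear_data))
    for lin, swz in _block_offsets(width_bytes, height):
        out[swz:swz + BLOCK_W] = linear_data[lin:lin + BLOCK_W]
    return bytes(out)


def unswizzle(swizzled_data, width_bytes, height):
    """Inverse of swizzle: rebuild row-major bytes from block order."""
    out = bytearray(len(swizzled_data))
    for lin, swz in _block_offsets(width_bytes, height):
        out[lin:lin + BLOCK_W] = swizzled_data[swz:swz + BLOCK_W]
    return bytes(out)


# A 32-byte wide, 16-row texture (e.g. 8x16 pixels at 4 bytes per pixel).
texture = bytes(i % 256 for i in range(32 * 16))
assert unswizzle(swizzle(texture, 32, 16), 32, 16) == texture
```

Note that the width is given in bytes, not pixels, so the same routine covers any pixel format as long as the row size is a multiple of 16 bytes.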
PSPONG Mini-Game.
It works perfectly, but super slow. (5~10 fps)
It must be doing lots of operations via CPU instead of rendering via GPU.
That was expected, since the current emulator implementation is interpreted.
JPCSP runs this demo faster, thanks to its dynamic recompilation and probably because it identifies functional blocks and replaces them with native functions (memcpy, memset, etc.).
At this point I don't care about speed. Once interpreted mode is working fine and I have a solid base of executable tests, I will start exploring the rest of the stuff.
On the other hand, real games make use of the GE, which is where this emulator shines and works better than the ones implemented in Java and C# (and the GE is still unoptimized).
SDL Demo I found on psp.scenebeta (http://psp.scenebeta.com/tutorial/tutorial-04-mostrar-un-archivo-bmp-en-pantalla).
This demo took a lot of time to get working since it uses SDL. The first version of the emulator was unable to run it. Here it started to work after I correctly implemented threads and semaphores.
The demo loads a PNG file using SDL and SDL_Image.
