PHP: parse_str_packet

May 27 2004

https://github.com/soywiz/phprobot

parse_str_packet is a function that I have created specifically for the Ragnarok bot I’m writing. It depends on other simple string-stream functions that I created to unpack numbers, strings and simple types.

parse_str_packet receives two parameters: &$d and $fmt. The first parameter, $d, as in the rest of the unpacking functions, is a reference to a string variable that contains the packet. In PHP a reference is equivalent to an alias, and it is needed to access, from inside a function, variables whose names the function does not know. It is also pretty useful (and faster) for reaching nested values such as $ejemplo['valor1'][0]['valor2']: it requires less processing; you make a reference once and that’s it.
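As a minimal sketch of both uses of references, assuming a hypothetical helper named read_word (it is not one of the real unpack functions of the bot):

```php
<?php
// Hypothetical helper: reads a 16-bit little-endian value and consumes it
// from the stream. Because $d is taken by reference, the caller's variable
// is shortened as well.
function read_word(string &$d): int {
    $value = unpack('v', $d)[1];   // 'v' = unsigned 16 bits, little endian
    $d = (string) substr($d, 2);   // drop the two bytes that were just read
    return $value;
}

$packet = "\x69\x00\x2A\x00";
$id  = read_word($packet);  // first word of the stream
$len = read_word($packet);  // second word; $packet is now empty

// A reference as an alias for a deeply nested value.
$ejemplo = ['valor1' => [['valor2' => 1]]];
$ref = &$ejemplo['valor1'][0]['valor2'];
$ref = 5;                   // writes through to $ejemplo
```

Each call advances the stream by itself, so the caller never has to track an offset.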

The other parameter is a string that contains the format of the packets and here it is the interesting part.

I have a file containing “the definition” of the packets that I’m currently parsing by hand with functions. The plan is that, instead of calling those functions, the code will pass the format to parse_str_packet, which generates an array with the associated values already parsed.

For example, the first packet of all the received ones is the 0069 packet.

In the file I have a line like this:

0069 a[login_id;account_id;login_id2;last_login;account_sex;servers]llll-z[24]w-bx[rest][a[host;port;name;users;main;newn]rlnf[ip]wz[20]www]

It is pretty hard to understand, but it makes the code much clearer, without needing 20 lines of code to parse the packet.

It works like this:

It iterates over the format string looking for characters that do things. For example:

  • a[list separated by ;] defines the names of the keys of the parameters that are going to be extracted (in order).
  • l extracts a DWORD (32 bits) from the string in local byte order (unless you define it otherwise).
  • w is like l but extracts a WORD (16 bits).
  • b is like w and l but extracts a BYTE (8 bits).
  • - immediately removes the previously parsed parameter. For example, a parameter that is not needed can be extracted and thrown away, or a parameter that is kept for compatibility reasons but not used anymore.
  • z[X] extracts a stringZ from the following X characters. You can use “rest” as X to extract the remainder of the string.
  • s[X] does the same as z[X] but keeps the whole string, including the NULL (0) characters.
  • r switches the extraction order from LOCAL to INTERNET order (big endian). It is used by IPs.
  • n switches back to the LOCAL byte order.
  • f[function] applies a filter to the last extracted parameter (it calls a PHP function with that parameter as its argument). ip is defined as a synonym for long2ip.

Now, the powerful part :) :

  • x[repeat][expression] extracts the expression “EXPRESSION” REPEAT times and stores the results as a nested array in the current parameter. REPEAT can be a number, “rest” to keep going until the end of the string, or a previously extracted parameter. For example, to repeat as many times as the value extracted two parameters before, use p:2 as REPEAT.
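The format characters above can be sketched as a small interpreter. This is only an illustration, not the real parse_str_packet: the function name parse_fmt_sketch is made up, it handles just a, l, w, b, z[N], r, n and -, it does not support nested brackets (so no x[...]), and it assumes that a value removed with - gives its key name back to the next field, which is how the 0069 definition appears to line up.

```php
<?php
// Hypothetical, partial interpreter for the format described above.
function parse_fmt_sketch(string &$d, string $fmt): array {
    $names = [];         // key names declared by a[...], consumed in order
    $out = [];           // parsed values
    $bigEndian = false;  // toggled by r (INTERNET) and n (LOCAL)
    $lastKey = null;     // key of the most recently parsed value
    $i = 0;
    $len = strlen($fmt);
    while ($i < $len) {
        $c = $fmt[$i++];
        // Optional [argument] right after the format character.
        $arg = '';
        if ($i < $len && $fmt[$i] === '[') {
            $end = strpos($fmt, ']', $i);
            if ($end === false) {
                throw new InvalidArgumentException('Missing ] in format');
            }
            $arg = substr($fmt, $i + 1, $end - $i - 1);
            $i = $end + 1;
        }
        switch ($c) {
            case 'a': $names = array_merge($names, explode(';', $arg)); continue 2;
            case 'r': $bigEndian = true;  continue 2;
            case 'n': $bigEndian = false; continue 2;
            case '-':
                if ($lastKey !== null) {
                    unset($out[$lastKey]);
                    array_unshift($names, $lastKey); // assumption: name is reused
                    $lastKey = null;
                }
                continue 2;
            case 'l': $size = 4; $code = $bigEndian ? 'N' : 'V'; break;
            case 'w': $size = 2; $code = $bigEndian ? 'n' : 'v'; break;
            case 'b': $size = 1; $code = 'C'; break;
            case 'z': $size = (int) $arg; $code = null; break;
            default:
                throw new InvalidArgumentException("Unsupported format char '$c'");
        }
        if (strlen($d) < $size) {
            throw new RuntimeException('Packet is shorter than the format');
        }
        $chunk = substr($d, 0, $size);
        $d = (string) substr($d, $size);       // consume the bytes from the stream
        $value = $code === null
            ? explode("\0", $chunk, 2)[0]      // stringZ: cut at the first NULL
            : unpack($code, $chunk)[1];
        $lastKey = array_shift($names) ?? count($out);
        $out[$lastKey] = $value;
    }
    return $out;
}

// Usage with a made-up packet layout (not a real Ragnarok packet).
$packet = pack('V', 1) . pack('V', 2) . "abc\0\0\0" . pack('v', 7) . pack('C', 9);
$parsed = parse_fmt_sketch($packet, 'a[id;name;flag]ll-z[6]w-b');
```

Here the second DWORD and the WORD are read and dropped, so the result only holds id, name and flag.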

Aaaaand that’s all. So far I have been able to define all the packets, but if it turns out that some packet cannot be defined I guess I will have to tweak it a bit. I have also included it in the PHPWiz system.

The objective of all this is that the reception functions do the actual work instead of also doing the parsing. It is slower, but it has a lot of advantages (mentioned above ;) ) and some more, like being able to make small changes to the protocol by editing a single file, without changing the code.

In any case, now that it is done, it is super cool compared to other bots that have a huge SWITCH where they define the size of the packets and extract everything there. Thanks to the power of PHP, I simply define a function like:

recv_serv_0073

and at runtime PHP detects the function and associates it with the packet (just by defining it; there is no need to register it anywhere else). In addition, I use an array with the sizes of all the packets, and a generic class uses it to extract each packet. It obtains the ID of the packet, checks the list, obtains the size and, if the size is variable, reads the following two bytes. Then it extracts a string with the packet and the ID, and the function is called.
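The dispatch step could look roughly like this. It is a sketch under assumptions: the names $packet_sizes and dispatch_packet are made up, the sizes in the table are only illustrative, -1 is used here to mark a variable-length packet, and the two length bytes are assumed to hold the total packet size including the header.

```php
<?php
// Hypothetical size table: packet ID => total size in bytes, -1 = variable.
$packet_sizes = [0x0069 => -1, 0x0073 => 11];

// A handler is found purely by its name; nothing else registers it.
function recv_serv_0073(string $body): string {
    return 'handled 0073 with ' . strlen($body) . ' body bytes';
}

// Extracts one packet from $stream and calls its handler, if one is defined.
// Returns null when more data is needed or no handler exists.
function dispatch_packet(string &$stream, array $sizes) {
    if (strlen($stream) < 2) {
        return null;                             // not even an ID yet
    }
    $id = unpack('v', $stream)[1];               // 16-bit little-endian ID
    if (!isset($sizes[$id])) {
        // Without a known size the stream cannot be resynchronised.
        throw new RuntimeException(sprintf('Unknown packet %04x', $id));
    }
    $size = $sizes[$id];
    $header = 2;
    if ($size === -1) {                          // variable length: read it
        if (strlen($stream) < 4) {
            return null;
        }
        $size = unpack('v', $stream, 2)[1];      // offset argument: PHP 7.1+
        $header = 4;
    }
    if (strlen($stream) < $size) {
        return null;                             // wait for the rest
    }
    $body = substr($stream, $header, $size - $header);
    $stream = (string) substr($stream, $size);   // consume the packet
    $handler = sprintf('recv_serv_%04x', $id);
    return function_exists($handler) ? $handler($body) : null;
}

// Usage: one 0073 packet followed by one stray byte of the next packet.
$stream = pack('v', 0x0073) . str_repeat("\x00", 9) . "\xAA";
$result = dispatch_packet($stream, $packet_sizes);
```

Adding support for a new packet then means adding one entry to the table and defining one recv_serv_XXXX function.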

I have also noticed that this way of framing packets is a bit confusing: if you don’t know the exact size of a packet, you lose information and cannot recover the stream. This is different from WoW’s protocol, which always indicates the size of the packet, so even if you don’t know a packet you can ignore it and continue with the next one. Am I complaining? No. That way Gravity makes sure it is a bit harder to make a client ;)

And that’s all, more or less. I have noticed that the message I posted before was displayed wrong. I initially thought that it was the encoding, but I have now seen that it was not the case.

UPDATE: Yes, it was the encoding… but the encoding of the form where I wrote the post :P After that I updated the encoding, once I had checked which one was required:

I have checked this: http://www.intellidimension.com/default.rsp?topic=/pages/rdfgateway/reference/script/response_charset.rsp

I have changed the encoding to iso-8859-1. I noticed that windows-1252 would work too, but the word “windows”… well… :P ISO sounds better ;) Besides, ISO: we have standards for a reason :D