Making pidgin behave better across gchat

posted Dec 9, 2012, 9:46 PM by eso   [ updated Dec 9, 2012, 9:48 PM ]

I don't know about anyone else, but gchat's stupid idiom of translating *text* to bold text bothers the crap out of me.

I finally got around to making a plugin, which can be found over here on dropbox.

If you're running on windows, the process is like this:

Step one:
Important: It must be 5.10.x. Pidgin will not acknowledge newer versions.

Step two:
In %appdata%\.purple, create a plugins folder (if it doesn't exist already) and place the file in that folder.
If you're unfamiliar with the %appdata% token: typing it into the Windows Explorer address bar takes you to the application data folder for the currently signed-in user. It's a handy shortcut.

Step three:
Restart pidgin.

Step four:
In Tools -> Plugins, enable gchatformat.

From this point on, you should be able to communicate with adium and pidgin users without asterisks or underscores being consumed automatically.
Note, this won't have any effect on anyone using an official google client. Nothing to be done there. :(

For anyone reading the plugin code.. Yes. All I'm doing is adding <b> </b> to the end of each message. It's stupid, but it works.

Creating a Purebasic wrapper for Chipmunk, Part 2

posted Nov 8, 2012, 1:43 AM by eso

When last we left off..

Step five: Create an automated tool to generate a full set of shims for the entirety of chipmunk, along with appropriate .desc files.

It turns out, this is actually a few different distinct tasks all deceptively bundled together.

So it’s more like:

Step five substep one: Assemble a list of all of the functions in the library against which we need shims.

Happily, I get to lead off straight away with leaning on a tool that ships with the chipmunk library: extract_protos.rb

# match 0 is the whole function proto
# match 1 is either "static inline " or nil
# match 2 is the return type
# match 3 is the function symbol name
# match 4 is the arguments
PATTERN = /.*?((static inline )?(\w*\*?)\s(cp\w*)\((.*?)\))/

IO.readlines("|gcc -DNDEBUG -E include/chipmunk/chipmunk.h").each do|line|
   str = line
   while match = PATTERN.match(str)
       str = match.post_match
       proto, inline, ret, name, args = match.captures.values_at(0, 1, 2, 3, 4)
       next if ret == "return" || ret == ""
       inline = !!inline
#        p({:inline => inline, :return => ret, :name => name, :args => args})
       puts "#{name} - #{inline ? "static inline " : ""}#{ret} #{name}(#{args})"
   end
end

This adorable little block of ruby code basically boils down to a regular expression, grabbing a few of the backreferences, and then lining them up in a happy little row.

This is a fantastic start, honestly. So, easy enough, let’s just drop that regular expression into PureBasic’s PCRE library!

Issue one: The pattern as listed isn’t quite right. If we don’t remove the enclosing slashes ( /  … / ), it’s not gonna work. At least this is trivial to fix. (I’ve been informed this is because the slashes thing is some kind of idiomatic alternative to quotes for regular expressions in some languages. Whatevs, I’m not here to learn ruby or perl or whatever.)

Issue two: PureBasic just seems to want to return a single result with MatchRegularExpression().

So, hey, here’s this ExtractRegularExpression() function! That’s totally what we want, right?

Sadly, no. All that does is return all of the matches in an array.. but not any of the individual backreferences. For this, we have to actually interface with pcre directly. Whee fun.

ImportC ""
 pb_pcre_exec(*pcre, *extra,subject.s, length.i, startoffset.i, options.i, *ovector, ovecsize.i)
EndImport

regexp_handle.i = CreateRegularExpression(#PB_Any, ".*?((static inline )?(\w*\*?)\s(cp\w*)\((.*?)\))", #PB_RegularExpression_AnyNewLine)

Dim pcre_results(18)

string.s = nextline
string = ReplaceString(string, #TAB$, "")
resultcount.i = pb_pcre_exec(PeekL(regexp_handle),0,string,Len(string),0,0,@pcre_results(),18)

In this sample, ‘nextline’ is actually from a file read loop, more or less duplicating the pattern of the ruby script above. The exciting part is the @pcre_results() array, which is essentially an output vector with three slots for every group the pattern can report (the whole match plus five captures, hence 18).
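To make the "fancy offset stuff" a little less mysterious, here is a minimal sketch in C of how that output vector is laid out. It does not call PCRE at all; it only reads an int array arranged the way pcre_exec() fills one in: slot 2*n holds the start offset of group n, and slot 2*n + 1 holds one past its end. The helper name copy_group is made up for illustration, and the sketch assumes plain single-byte text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy capture group `n` out of `subject`, reading offsets laid out the way
 * pcre_exec() fills its output vector: ovector[2*n] is the start offset and
 * ovector[2*n + 1] is one past the end. `rc` is the value pcre_exec()
 * returned (highest group number that matched, plus one).
 * Returns the group length, or -1 if the group is unavailable or does not
 * fit in `out`. copy_group is a hypothetical helper, not part of PCRE. */
int copy_group(const char *subject, const int *ovector, int rc, int n,
               char *out, size_t outsize)
{
    assert(subject != NULL && ovector != NULL && out != NULL);
    if (n < 0 || n >= rc)
        return -1;                 /* group number out of range */
    int start = ovector[2 * n];
    int end = ovector[2 * n + 1];
    if (start < 0 || end < start)
        return -1;                 /* group did not take part in the match */
    size_t len = (size_t)(end - start);
    if (len + 1 > outsize)
        return -1;                 /* no room for the text plus the NUL */
    memcpy(out, subject + start, len);
    out[len] = '\0';
    return (int)len;
}
```

The last third of the vector is scratch space for PCRE itself, which is why the array needs three slots per group rather than two.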

Now, the output vector contains a whole bunch of fancy offset stuff... but honestly, that’s too much of a pain in the ass. So happily, diving further into the pcre API, we find these:

ImportC ""
 pb_pcre_get_substring(subject.s, *ovector, stringcount.i, stringnumber.i, stringptr)
 ; assumed to be exported with the same pb_ prefix as the other PCRE calls
 pb_pcre_free_substring(stringptr)
EndImport

With these, we can fetch individual substrings (or backreferences, or captures, or groups, or whatever term tickles your particular fancy).

The trickery with stringptr is kind of fun; the native C binding for pcre_get_substring wants as its last argument **stringptr, but PureBasic does not actually support ‘pointer to a pointer’. So I simply pass it a pointer to an integer value, and then use that integer value as an argument to PeekS() to fetch the target string.

Of course, using these directly is seriously a pain in the ass, so I wrote a helper function so I wouldn’t have to remember to free all these strings all the time:

Procedure.s get_substring(string.s, *ovector, resultcount, stringnumber)
 Define outputstring.s, stringptr.i
 pb_pcre_get_substring(string, *ovector, resultcount, stringnumber, @stringptr)
 outputstring= PeekS(stringptr, -1, #PB_UTF8)
 pb_pcre_free_substring(stringptr)
 ProcedureReturn outputstring
EndProcedure

This greatly simplifies things. Now, after that pb_pcre_exec() call, I can do this:

returntype.s = get_substring(string, @pcre_results(), resultcount, 3)
functionname.s = get_substring(string, @pcre_results(), resultcount, 4)
arguments.s = get_substring(string, @pcre_results(), resultcount, 5)
If returntype = "return" Or returntype = ""
 Continue
EndIf
Debug "Name: (" + functionname + ")  Return Type: (" + returntype + ")  Arguments: (" + arguments +")"

This is found inside the same file read loop as mentioned above, and happily returns what I need, more or less matching the results of the ruby script. Victory!

Well, not quite. There’s still more catches.

Issue three: I kind of need the comments.

The ruby script runs the chipmunk.h file through gcc’s preprocessor to strip out irrelevant things, like all the conditional ifdef nonsense.. but that also happens to strip out the comments in the .h files that function as documentation. I’m going to need this for the .desc files, so that the PB IDE can do its tooltip/usage hint thing.

The answer to this is unfortunately I’ll have to do some hand massaging of the .h files. Concatenate them all into one file, and manually edit out everything that isn’t relevant/valid. This fortunately doesn’t look like a whole lot of work.

For issue three, the straightforward solution is to keep the last few lines of text. Each time I have a successful match, scan the previous few lines for ones beginning with a triple slash (the chipmunk .h files consistently use /// as their flag for documenting comments), and grab those to put with the output of a particular function. Keeping a small buffer of previous lines is pretty easy, fortunately.
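A minimal sketch of that "small buffer of previous lines" idea, written in C rather than PureBasic so it can stand alone. Everything named here (line_history, history_push, history_doc_comment, the buffer sizes) is hypothetical. It also makes one simplifying choice that the description above doesn't insist on: it only collects the unbroken run of /// lines sitting directly above the matched prototype.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HISTORY 4      /* how many previous lines to remember (arbitrary) */
#define LINE_LEN 256   /* longest line kept; longer lines are truncated */

typedef struct {
    char lines[HISTORY][LINE_LEN];
    int count;         /* total number of lines pushed so far */
} line_history;

/* Remember one more line, overwriting the oldest slot once the ring is full. */
void history_push(line_history *h, const char *line)
{
    assert(h != NULL && line != NULL);
    snprintf(h->lines[h->count % HISTORY], LINE_LEN, "%s", line);
    h->count++;
}

/* Collect the run of "///" lines most recently pushed, oldest first, joined
 * with newlines. Returns how many comment lines were found. */
int history_doc_comment(const line_history *h, char *out, size_t outsize)
{
    assert(h != NULL && out != NULL && outsize > 0);
    int available = h->count < HISTORY ? h->count : HISTORY;
    int run = 0;
    /* Walk backward from the newest line while the lines are doc comments. */
    while (run < available &&
           strncmp(h->lines[(h->count - 1 - run) % HISTORY], "///", 3) == 0)
        run++;
    out[0] = '\0';
    for (int i = run; i >= 1; i--) {
        size_t used = strlen(out);
        snprintf(out + used, outsize - used, "%s%s",
                 used > 0 ? "\n" : "", h->lines[(h->count - i) % HISTORY]);
    }
    return run;
}
```

In the read loop, the order would be: test the line against the pattern, call history_doc_comment() on a match, and only then push the line into the history.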

For the preparatory step, I concatenated most of the .h files into one overall document. I omitted: chipmunk_ffi.h, chipmunk_private.h, chipmunk_unsafe.h, and util.h in the constraints folder.

Following that, I made the following edits:

  • Comment out cpMessage(), as it appears to be only really used internally (and I don’t know how to convert it anyway)
  • Comment out cpInitChipmunk(), as it is deprecated.
  • Delete the defines that involve ‘blocks’ as I’m not using a compiler that supports them, also delete the C++ bits as I’m working with this as C
  • Convert two of the cpSpace defines from multi-line to single-line declarations ( cpSpaceSetDefaultCollisionHandler() and cpSpaceAddCollisionHandler() )
  • Comment out cpConstraintActivateBodies and cpConstraintGetImpulse as they both appear to be private.

With these edits, I seem to have most of the necessary things. Because I’m not running this through the pre-processor, I lose out on all the get/set functions, but I probably won’t need those anyway.

Result: 226 functions found (versus the 311 that come out of the ruby script’s approach, iterating over the results of the gcc preprocessor). Considering how much is getting omitted (all the private stuff, all the get/set functions), I think that’s a pretty good haul for automation.

Issue four: Oh, yeah, those whole ‘argument’ things.

See, in order to really get work done, I need to break down the actual arguments into a coherent set. This won’t actually be terribly hard, because comma separated lists of two words (or sometimes three) aren’t too much trouble to work with.
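As a rough illustration of the "comma separated list of two or three words" part, here is a sketch in C. It is not the parser from this project; c_arg, split_arg and split_args are invented names, and it assumes every argument ends in a plain identifier (so inline function-pointer declarations are out of scope, and a lone "void" counts as no arguments).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    char type[64];   /* e.g. "const cpVect" or "cpBody *" */
    char name[64];   /* e.g. "v1" or "body" */
} c_arg;

/* Split one argument into type and name. The name is taken to be the
 * trailing identifier; whatever precedes it is the type.
 * Returns 1 on success, 0 when there is no separate name (e.g. "void"). */
int split_arg(const char *text, c_arg *out)
{
    assert(text != NULL && out != NULL);
    size_t end = strlen(text);
    while (end > 0 && isspace((unsigned char)text[end - 1]))
        end--;                                   /* trim trailing blanks */
    size_t start = end;
    while (start > 0 && (isalnum((unsigned char)text[start - 1]) ||
                         text[start - 1] == '_'))
        start--;                                 /* back up over the name */
    size_t tbegin = 0;
    while (tbegin < start && isspace((unsigned char)text[tbegin]))
        tbegin++;                                /* trim leading blanks */
    size_t tend = start;
    while (tend > tbegin && isspace((unsigned char)text[tend - 1]))
        tend--;                                  /* trim blanks before name */
    if (start == end || tend == tbegin)
        return 0;
    snprintf(out->type, sizeof out->type, "%.*s",
             (int)(tend - tbegin), text + tbegin);
    snprintf(out->name, sizeof out->name, "%.*s",
             (int)(end - start), text + start);
    return 1;
}

/* Split a whole argument list on commas. Returns how many were parsed. */
int split_args(const char *list, c_arg *out, int max)
{
    assert(list != NULL && out != NULL);
    int n = 0;
    const char *p = list;
    while (*p != '\0' && n < max) {
        size_t len = strcspn(p, ",");
        char piece[128];
        snprintf(piece, sizeof piece, "%.*s", (int)len, p);
        if (split_arg(piece, &out[n]))
            n++;
        p += len;
        if (*p == ',')
            p++;
    }
    return n;
}
```

Keeping the * attached to the type half means the later pointer-or-not decision can be made by looking at the type string alone.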

The trickier part is deciding what to do with them; knowing when to do a type translation, when to turn the wrapper’s version into a pointer or not, preserving pointer arguments where needed.

I’m going to have to chicken out a bit here and not go into detail on what I did, in no small part because I ended up not completing this as a discrete step, and instead rolled it in as part of

Step five substep two: output my shim.c and chipmunk.desc files

I got the C output working first, and it’s incredibly ugly. The resultant source code doesn’t really deserve detailed attention, so I’ll just broadly discuss the engineering challenges.

First, I had to distinguish between pointers and not pointers, along with what things I actually needed to be pointers. I applied the following rules:

  1. If chipmunk is dealing with a by-value object, and I can work with that object by value, I have the declaration and body identical.
  2. If chipmunk is dealing with a by-value object, and I need a pointer (because it’s a structure), then I have a * in both the function declaration and in the function body.
  3. If chipmunk is dealing with by-reference, regardless of whether reference to an object or reference to a reference to an object, I need the original * or ** in the function declaration, but no * or ** in the function body.
  4. If the return type needs to be by-reference and isn’t in chipmunk, the function declaration gets a ‘void’ return type instead and a ‘*result’ argument inserted at the beginning. I adjust the function body to include an assignment instead of having the function return anything.

Honestly, I really think the way C handles its pointer syntax is freaking retarded, but I suppose it makes sense to somebody. Whatever. I have a working set of rules I can have my parser handle for me, and that’s the important thing.

The list of types for which I know I don’t need a pointer is fairly short:
"cpFloat", "int", "cpLayers", "cpGroup", "cpBool", anything ending in "Func", and "void" (it’s still a type! technically)

For any of those, I just copy it as-is, except when rule #3 above is in play.
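Here is how I would sketch the four rules as actual C, using stand-in types so it compiles without chipmunk. The names real, vec2 and the lib_* functions are placeholders for cpFloat, cpVect and the real library calls; none of this is the generated shim.c itself.

```c
#include <assert.h>
#include <stddef.h>

typedef double real;                   /* stand-in for cpFloat */
typedef struct { real x, y; } vec2;    /* stand-in for cpVect */

/* Pretend library functions, one per rule. */
static real lib_clamp01(real f) { return f < 0.0 ? 0.0 : (f > 1.0 ? 1.0 : f); }
static real lib_dot(vec2 a, vec2 b) { return a.x * b.x + a.y * b.y; }
static void lib_negate(vec2 *v) { v->x = -v->x; v->y = -v->y; }
static vec2 lib_add(vec2 a, vec2 b) { vec2 r = { a.x + b.x, a.y + b.y }; return r; }

/* Rule 1: plain by-value types; declaration and body match the original. */
real PB_clamp01(real f) { return lib_clamp01(f); }

/* Rule 2: the library takes structs by value, the wrapper takes pointers,
 * so there is a * in the declaration and a * in the body. */
real PB_dot(vec2 *a, vec2 *b) { return lib_dot(*a, *b); }

/* Rule 3: the library already takes a pointer, so the * stays in the
 * declaration and the body passes the pointer straight through. */
void PB_negate(vec2 *v) { lib_negate(v); }

/* Rule 4: the library returns a struct by value, so the wrapper returns
 * void and writes through a leading result pointer instead. */
void PB_add(vec2 *result, vec2 *a, vec2 *b)
{
    assert(result != NULL);   /* sanity check for the sketch only */
    *result = lib_add(*a, *b);
}
```

The point of all four is the same: by the time a call reaches the PureBasic side, every argument and return value is either a plain number or an address.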

With all that in place, my parser happily produced a .c file that gcc compiled without errors (or even warnings).

I created a library file, hand-edited my .desc to include a few entries, used the SDK to produce the library.. and got the desired results from the test file.

(In the process, I did discover that the constraints logic was NOT being included in my build process, so I had to add it. Whoops. Easily resolved: compile those .c files, and add the resultant .o files to the ar command that produces the .a. It’s a dotletter party!)

The .desc file is next. This requires some substitutions, and re-ordering of how I assemble the text. It’s not terribly complicated, but it did require more logic in the parser to know what substitutions were needed.

One of the gotchas I didn’t really plan on is that every definition needs to be cdecl. It didn’t occur to me that calling conventions would be an issue, but it turns out they are. Once I got that sorted out, things went a lot more smoothly.. at first.

Step Six: Implement the Purebasic side of things

This got exciting.

Unless there’s some mechanism available in PureBasic’s library tools I’m not aware of, all the structures and whatnot (helper functions, macros, the like) still need to go in a purebasic include file.

Mapping out the structures seemed pretty straightforward at first, though working out data types was a little finicky. I ended up deciding to use the data types found in the original source.

Doing this meant adding a lot of macros like these:

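Macro cpTimestamp ; unsigned int in C; a 32-bit long is assumed here
  l
EndMacro
Macro cpBool ; int in C; a 32-bit long is assumed here
  l
EndMacro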

This is in addition to the trickery needed to ensure the cpFloat type works right on 32-bit macOS targets (are there still those out there? I have no idea), since that’s double everywhere EXCEPT 32-bit macOS, where it’s a single-precision float.

It works, though. It helps keep things clear as to where things came from, and it helped massively in figuring out why things were crashing so constantly.

It turns out that gcc was helpfully aligning objects in their structures for performance. Notably, doubles were being aligned to 8-byte boundaries. This is all well and good, but PureBasic doesn’t know about gcc’s alignment habits. At first I was just blindly adding padding, but it quickly became obvious that was not a sensible way to go about business. After some research (read: bribing my pet C expert), I discovered two approaches.

One was to just tell gcc not to pad structures, using the -fpack-struct command line option. This is pretty simple for me, but it has potential performance penalties. The other is to do a test run using the -Wpadded command line option to find out where gcc was opting to pad structures.

At first I tried this on the original source code and got thousands of lines of text. No good. Then I realized they were all repetitions, and I was passed the hint of creating a file called t.c with these contents:

#include "chipmunk.h"
int i;

Fed these into gcc using
gcc -I../include/chipmunk -Wpadded -std=gnu99 -c t.c

Results, after trimming out the excess:

../include/chipmunk/cpBody.h:86:10: warning: padding struct to align 'v_limit'
../include/chipmunk/cpShape.h:33:9: warning: padding struct to align 'p'
../include/chipmunk/cpShape.h:43:10: warning: padding struct to align 't'
../include/chipmunk/cpShape.h:86:10: warning: padding struct to align 'e'
../include/chipmunk/cpPolyShape.h:38:1: warning: padding struct size to alignment boundary
../include/chipmunk/cpArbiter.h:172:4: warning: padding struct to align 'points'
../include/chipmunk/constraints/cpConstraint.h:83:1: warning: padding struct size to alignment boundary
../include/chipmunk/constraints/cpDampedSpring.h:40:10: warning: padding struct to align 'target_vrn'
../include/chipmunk/constraints/cpDampedRotarySpring.h:37:10: warning: padding struct to align 'target_wrn'
../include/chipmunk/cpSpace.h:34:9: warning: padding struct to align 'gravity'
../include/chipmunk/cpSpace.h:81:1: warning: padding struct to align 'curr_dt_private'

That’s far more workable.

I can disregard the “padding struct size to alignment boundary” items, leaving me with just 9 different places where I need to add a 4-byte padding element.
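For anyone who wants to see one of those gaps directly, here is a small C sketch that measures it with offsetof. The struct is a made-up stand-in (a 4-byte member followed by a double), not an actual chipmunk structure, and the function names are hypothetical. On a target where int is 4 bytes and doubles are 8-byte aligned, the reported gap works out to 4 bytes; other ABIs may pad differently or not at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the kind of layout -Wpadded complains about:
 * a small member followed by a double. */
struct sample {
    int flag;
    double value;
};

/* Probe used to measure how the compiler aligns a double inside a struct. */
struct align_probe {
    char c;
    double d;
};

/* Bytes the compiler inserted between `flag` and `value`. */
size_t sample_padding(void)
{
    size_t end_of_flag = offsetof(struct sample, flag) + sizeof(int);
    assert(offsetof(struct sample, value) >= end_of_flag);
    return offsetof(struct sample, value) - end_of_flag;
}

/* Alignment the compiler applies to a double member on this target. */
size_t double_alignment(void)
{
    return offsetof(struct align_probe, d);
}

/* Print the layout so it can be compared against the PureBasic structure. */
void print_layout(void)
{
    printf("value at offset %zu, padding %zu, sizeof %zu\n",
           offsetof(struct sample, value), sample_padding(),
           sizeof(struct sample));
}
```

Whatever number comes out is the size of the filler member the matching PureBasic Structure would need at that spot.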

Beyond these elements, mostly getting the include file together is hand massaging. I renamed a bunch of struct members for clarity (Changing ‘e’ to ‘elasticity’ for instance), since I’m not implementing most of the getter/setter functions from chipmunk.

Another note for purebasic implementations is any callback that might be registered with Chipmunk needs to be a ProcedureC instead of a regular Procedure, since chipmunk compiled this way is in fact using the Cdecl calling convention.

My next step from here is attempting to test the techniques above on more target architectures. Right now, I’m working on windows x86, since that was an easy path with mingw32. I still need to try to build with mingw64 and PB x64, not to mention trying to get a MacOS build together. (that will only be 32-bit, but that’s a good place to test a lot of the anticipated platform quirks)

After that is simply finishing out the include file and starting to assemble some validation tests.

More on those once I have something to report.

Creating a Purebasic wrapper for Chipmunk.

posted Oct 22, 2012, 12:42 AM by eso   [ updated Oct 22, 2012, 12:44 AM ]

Step one: compile Chipmunk in mingw as a .dll

This was pretty easy. Following the history on this page I eventually landed on this process:

Unpack archive. Navigate to src tree.

gcc -I../include/chipmunk -O3 -ffast-math -std=gnu99 -c *.c

Success so far; no errors, no warnings.
(using -fPIC as per the forum post above spits out a warning that -fPIC is irrelevant because all code is position independent)

gcc --shared -o chipmunk-6.1.1.dll -Wl,--out-implib=chipmunk-6.1.1.dll.a -Wl,--output-def=chipmunk-6.1.1.dll.def cp*.o chipmunk.o

This produces a .dll that I confirmed is valid using dependency walker.

Step two: test .dll in purebasic

This one was more awkward. I didn’t know at first that mingw was producing x86 code, and I was trying to use x64. I eventually worked that out, and started using the x86 version of purebasic.

At first I used infratec’s tool to make a bare-bones almost-wrapper.. then I added some types..

; Warning! Achtung!
; This will be a Float on 32-bit MacOS installations!
; This will need to be addressed at some point.
Macro cpFloat
  d
EndMacro

; See above. Floats on 32-bit MacOS.
Structure cpVect
  x.cpFloat
  y.cpFloat
EndStructure

and defined a single prototype to start with...

OSPrototype.d Proto_cpvtoangle(x.d, y.d)

And made a very simple test case.


tempvec1\x = 1.0
tempvec1\y = 0.0
tempvec2\x = 0.0
tempvec2\y = 1.0
tempvec3\x = 0.7
tempvec3\y = 0.7

tempdouble = cpvtoangle(tempvec1\x, tempvec1\y)
Debug tempdouble
tempdouble = cpvtoangle(tempvec2\x, tempvec2\y)
Debug tempdouble
tempdouble = cpvtoangle(tempvec3\x, tempvec3\y)
Debug tempdouble



So far, so good.

Around this point, I started realizing a few things. One, an awful lot of chipmunk involves passing structures by value, not just as arguments, but as the return from functions.

At first, I pursued using assembly to play with hidden parameters, but then a simpler solution occurred: Make a shim in C that makes GCC do all that work for me.

Since I don’t HAVE to use this as a .dll, and in fact in the long term I don’t want to, the ASM approach is highly suboptimal.

Step three: create a test shim in C to make GCC do all the work for me

This immediately presents a problem: I don’t know C. Okay, let’s back up.

Step three: bribe a C programmer to tell me exactly how to make GCC do the work for me.

For this process, first you need to befriend an expert in C. If you don’t have one, well, good luck.

Through a careful series of [redacted] I managed to create the following test case.

Using one of the inline functions from cpVect.h:

static inline cpFloat cpvdist(const cpVect v1, const cpVect v2)
{
    return cpvlength(cpvsub(v1, v2));
}

I created the following shim:

#include "chipmunk.h"

void PB_cpvdist(cpFloat *result, cpVect *v1, cpVect *v2)
{
    *result = cpvdist(*v1, *v2);
}

I put this in a shim.c file in the src folder, and went back to the compilation process in step one above.

Result: a .dll file with one extra function in it than before.

I created a new test .pb file:

Prototype.i proto_PB_cpvdist(*result.cpFloat, *v1.cpVect, *v2.cpVect)
Global PB_cpvdist.proto_PB_cpvdist
chipmunk = OpenLibrary(#PB_Any, "chipmunk-x86.dll")
PB_cpvdist = GetFunction(chipmunk, "PB_cpvdist")


Define dist.cpFloat, vector1.cpVect, vector2.cpVect
vector1\x = 1.0
vector1\y = 0.0
vector2\x = 0.0
vector2\y = 1.0

PB_cpvdist(@dist, @vector1, @vector2)

Debug dist

(Not pictured: the previously indicated macro and structure declaration from step Two)



Well, that sure looks like it’s working. Just to be sure, I also tested 1.0,0.0 against 0.0,0.0, and got 1.0. Equal vectors result in a dist of 0.0. That’s enough to believe that it’s probably okay.

Step Four: Make a purebasic library.

This seems simple on the surface. The forum post I found here implied that the .a output from mingw would work fine for this purpose.

I thought: Oh hey my .dll creation process ALSO created a .a file! I’m saved!


Not even a little.

Even after correctly making a chipmunk-x86.desc file...

 ; Langage used to code the library: ASM or C
 ; Number of windows DLL than the library need
 ; Library type (Can be OBJ or LIB).
 ; Number of PureBasic library needed by the library. Here we need the Gadget library
 ; Help directory name. Useful when doing an extension of a library and want to put
 ; the help file in the same directory than the base library. This is not a facultative
 ; result.
 ; Library functions:

cpvdist, Long, Long, Long (*Result, *Vector1, *Vector2) - Result is Float/Double. Vector1 and Vector2 are cpVects

… it turns out that, no, that doesn’t work. Because the code was compiled as a shared library, the stub in the .a immediately tries to load in the .dll file, and things go sad from there.

At first I thought I would need to follow some complicated process involving using Microsoft’s Visual Studios tools to create a proper .lib file..

But I thought back to milan1612’s post in the PB forums, and did more digging.

This was the result of my search. Specifically, the following instruction for creating a .a file using mingw:

ar rcs libadd.a add.o

I thought: There is no way it could be that simple. There has to be more to this story.

Spoiler: Nope. That’s exactly what I needed.

ar rcs chipmunk-x86.a *.o

After feeding the .desc file and .a file (helpfully renamed as .lib) into the LibraryMaker.exe, I had myself a library. I dropped it into the userlibraries folder, and edited my above test code to use cpvdist instead of PB_cpvdist.



Success! This process does, in fact, work.

Step five: Create a full set of shims for the entirety of chipmunk, along with .desc files for x86 and x64 targets.

No, wait, that’s way too much work.

Step five: Create an automated tool to generate a full set of shims for the entirety of chipmunk, along with appropriate .desc files.

That sounds better.

This is still quite a bit of work, since it requires some infrastructure construction, but it has improved maintainability. I can maintain a single list of all functions, and have a tool autogenerate the shim and .desc files for any target architecture I desire. This should greatly simplify tracking the upstream library.

More on this later.

still active.

posted Mar 9, 2012, 6:04 PM by eso

at present, working on the compiler and vm for the language that will be used in future projects.

it's called frf, for no reason in particular. forth-like with some non-forth-like elements. stack based but managed. no direct pointer work, heavily multithreaded. lots of inspiration from erlang.

progress is slow, but steady.

initial update

posted Nov 26, 2011, 5:16 AM by eso

site in place. added basic versions of files.

boids and sabo are bare bones demos, and currently the engine (such as it is) backing them is being reworked.
