Writing a good replacement for LibGS

This is the second part of the article on how to write fast and reliable code for the PlayStation.

Let’s start by setting up the general environment used to initialize the console; a structure will be useful for that purpose:

typedef struct tagGsEnv
{
	// rendering related structures
	DRAWENV Draw_env[2];
	DISPENV Disp_env[2];
	DRAWENV *pDraw;
	DISPENV *pDisp;
	u32 OTag[2][OT_SIZE];			// sort tables
	u32 *pOt;						// current OTag pointer
	u16 OTag_id;					// current OTag index, flip every frame
	u8 VSync_rate;					// 0 = 60 fps, 2 = 30 fps
	u8 Clear_mode;					// 0 = clear with rgb, 1 = no clear
	s16 Screen_x, Screen_y;			// option menu adjustments
	u16 Screen_w, Screen_h;			// screen size, internal usage
	u32 *Gfx_alloc[2];				// packet allocators
	u32 *pGfx;						// current packet seek
	CVECTOR Clear;					// clear color
} GS_ENV;

// the actual object, put this somewhere in a .C file and declare it as external in a header
volatile GS_ENV G;

Now the actual code to populate some of this structure:

// This function sets up the draw/display environment
// ---------------------------
// Parameters
// x/y: frame buffers starting position in VRAM 
// w/h: size of display/draw
// mode: bitflag to determine if we're using interlacement or sideways frame buffers
void SetDisplay(int x, int y, int w, int h, u32 mode)
{
	int x0, x1, y0, y1;

	// copy resolution for later needs
	G.Screen_w = w;
	G.Screen_h = h;

	// interlaced mode: both environments share the same area
	// (RESMODE_INTERLACE is an assumed flag name, match it to your own header)
	if (mode & RESMODE_INTERLACE)
	{
		SetDefDrawEnv(&G.Draw_env[0], x, y, w, h);
		SetDefDispEnv(&G.Disp_env[0], x, y, w, h);
		SetDefDrawEnv(&G.Draw_env[1], x, y, w, h);
		SetDefDispEnv(&G.Disp_env[1], x, y, w, h);
		G.Disp_env[0].isinter = TRUE;
		G.Disp_env[1].isinter = TRUE;
	}
	else
	{
		// frame buffers are stored sideways
		if (mode & RESMODE_SIDEWAYS)
		{
			x0 = x;
			x1 = x + w;
			y0 = y;
			y1 = y;
		}
		// otherwise they are placed vertically
		else
		{
			x0 = x;
			x1 = x;
			y0 = y;
			y1 = y + h;
		}

		// libgpu calls to set up the environment
		// each display environment shows the buffer its draw environment is NOT rendering to
		SetDefDrawEnv(&G.Draw_env[0], x0, y0, w, h);
		SetDefDispEnv(&G.Disp_env[0], x1, y1, w, h);
		SetDefDrawEnv(&G.Draw_env[1], x1, y1, w, h);
		SetDefDispEnv(&G.Disp_env[1], x0, y0, w, h);
		// disable interlacement, we don't need it
		G.Disp_env[0].isinter = FALSE;
		G.Disp_env[1].isinter = FALSE;
	}

	// enable draw on display area
	G.Draw_env[0].dfe = G.Draw_env[1].dfe = TRUE;
}

This function does basically what the GS functions do to initialize the frame buffer, with 2-3 calls merged into just one. Also notice how I’m not using a million parameters to set up the frame buffer mode: everything optional is folded into a single bitflag. That is because we don’t wanna use more than 4 parameters most of the time; remember that the MIPS calling convention passes the first four arguments in registers (a0-a3) and pushes anything past parameter 4 onto the stack, which we don’t want since it costs extra memory accesses and produces messier binaries. Limit yourself as much as possible when you write a function prototype, or it’s going to look ugly and perform worse.
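To make the buffer placement concrete, here is a minimal host-side sketch that only computes where the two frame buffers land in VRAM for each mode. It doesn’t touch libgpu at all: `BufferPos`, `buffer_layout` and the `SIDEWAYS` value are made-up names for illustration, with `SIDEWAYS` standing in for RESMODE_SIDEWAYS.

```c
#include <assert.h>

#define SIDEWAYS 1	/* hypothetical stand-in for RESMODE_SIDEWAYS */

/* top-left corner of a frame buffer in VRAM */
typedef struct { int x, y; } BufferPos;

/* Compute the top-left corner of both frame buffers.
   out[0] is the first buffer, out[1] the second one. */
static void buffer_layout(int x, int y, int w, int h, unsigned mode, BufferPos out[2])
{
	assert(w > 0 && h > 0);

	out[0].x = x;
	out[0].y = y;
	if (mode & SIDEWAYS)
	{
		/* second buffer sits to the right of the first */
		out[1].x = x + w;
		out[1].y = y;
	}
	else
	{
		/* second buffer sits below the first */
		out[1].x = x;
		out[1].y = y + h;
	}
}
```

With a 320x240 display at 0,0 this puts the second buffer at 320,0 in sideways mode and at 0,240 otherwise. Whichever buffer is being drawn into, the display environment should point at the other one.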

Let’s move on to the packet allocators, which correspond to Gfx_alloc in the big structure above. Size them according to how many primitives you need, but always make them big enough for the heaviest task (example: sprites for menu interfaces). Go for malloc3 or even a global variable in your program; it doesn’t matter in the end, as you probably won’t ever need to resize them at any point in the program’s life. Some example code of how you would populate the rest of the structure:

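// dynamic allocation
void InitGfxAlloc(int size)
{
	G.Gfx_alloc[0] = (u32*)malloc3(size);
	G.Gfx_alloc[1] = (u32*)malloc3(size);
}

// static allocation (alternative to the dynamic version, pick one)
#define GFX_ALLOC_SIZE	(15 * 1024) // 15 KB buffer

static u32 _gfxAlloc[2][GFX_ALLOC_SIZE / sizeof(u32)];	// u32 keeps packets word-aligned
void InitGfxAlloc(int size)	// size is unused here, the buffers are fixed
{
	G.Gfx_alloc[0] = _gfxAlloc[0];
	G.Gfx_alloc[1] = _gfxAlloc[1];
}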

// set handy pointers for frame buffer swap
void ResetGsEnv()
{
	int Id = G.OTag_id;
	// set references for quick access
	G.pGfx = G.Gfx_alloc[Id];	// graphics
	G.pOt = G.OTag[Id];			// otag
	G.pDraw = &G.Draw_env[Id];	// environments
	G.pDisp = &G.Disp_env[Id];
}

// deal with frame buffer swap and clear background if it's necessary
// put this at the beginning of a screen loop
void BeginDraw()
{
	DRAWENV *pD;

	// point allocator, otag and environments at the current buffer
	ResetGsEnv();
	ClearOTagR(G.pOt, OT_SIZE);

	pD = G.pDraw;
	if (G.Clear_mode == 0)
	{
		pD->r0 = G.Clear.r;
		pD->g0 = G.Clear.g;
		pD->b0 = G.Clear.b;
		pD->isbg = TRUE;
	}
	else pD->isbg = FALSE;
}

// draw all linked primitives and perform the actual swap for next frame buffer
// put this at the end of a screen loop
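void EndDraw()
{
	// wait for the GPU and the vertical blank (usual libgpu swap sequence, assumed here)
	DrawSync(0);
	VSync(G.VSync_rate);

	// display previous frame
	PutDispEnv(G.pDisp);
	// set current buffer as the draw target
	PutDrawEnv(G.pDraw);

	DrawOTag(&G.pOt[OT_SIZE - 1]);

	G.OTag_id ^= 1;
}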

// ----------------------------
// this code goes into a header
// ----------------------------

// retrieve current primitive pointer
static __inline void *gfxGetPtr()
{
	return G.pGfx;
}

// update packet allocator
static __inline void gfxSetPtr(void* p)
{
	G.pGfx = (u32*)p;
}

static __inline u32* GetOTag() { return G.pOt; }
static __inline int  GetBufferIndex() { return G.OTag_id; }

static __inline RECT* SysGetTexWin()	{ return &G.pDraw->tw; }
static __inline u32   SysGetTPage()		{ return G.pDraw->tpage; }
static __inline RECT* SysGetDisplay()	{ return &G.pDisp->disp; }
static __inline RECT* SysGetScreen()	{ return &G.pDisp->screen; }

If you’re asking why I have two allocators instead of just one, the reason is pretty simple: double buffering. The PlayStation needs two memory locations for packet data because the GPU takes a while to process a whole frame’s worth of primitives. Drawing doesn’t happen immediately, so you need a back buffer to store new primitives while the old ones are still going through the DMA.
So, the first set of functions is what makes the packet allocators work and provides an environment for frame buffer swaps, while the second slice is how you would retrieve pointers in order to actually draw and seek forward. Most of those static inline functions aren’t actual calls but code that gets copied as-is into the caller, so you avoid the overhead of real calls while keeping your code slim.
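To show how the get/set pointer pair is meant to be used, here is a small host-side sketch of the same idea with stand-in types, so it compiles without the SDK. `Env`, `env_begin_frame`, `env_end_frame`, `gfx_alloc` and the 64-word buffer size are hypothetical names and values picked for illustration. In real code you would cast the returned pointer to a libgpu primitive and link it into the ordering table with AddPrim.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PACKET_WORDS 64	/* hypothetical buffer size, in 32-bit words */

typedef struct
{
	uint32_t buffers[2][PACKET_WORDS];	/* one packet buffer per frame */
	uint32_t *seek;						/* current write position */
	int id;								/* buffer used by the current frame */
} Env;

/* Point the allocator at the start of the current frame's buffer. */
static void env_begin_frame(Env *e)
{
	e->seek = e->buffers[e->id];
}

/* Flip to the other buffer once the frame has been submitted. */
static void env_end_frame(Env *e)
{
	e->id ^= 1;
}

/* Hand out 'words' 32-bit words and move the seek pointer forward.
   Returns NULL when this frame's buffer is exhausted. */
static void *gfx_alloc(Env *e, size_t words)
{
	uint32_t *p = e->seek;
	uint32_t *end = e->buffers[e->id] + PACKET_WORDS;

	assert(p != NULL);	/* env_begin_frame must have been called */
	if (words > (size_t)(end - p))
		return NULL;
	e->seek = p + words;
	return p;
}
```

The article’s inline functions skip the bounds check to stay as cheap as possible; the check here is only a safety net for the sketch, and a debug build could keep it while a release build drops it.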

To see the code above used in a real dev case, let’s piece it all together with another sample:

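#define SCREEN_W	320
#define SCREEN_H	240

int main()
{
	ResetGraph(0);					// reset the GPU first (usual libgpu start-up, assumed here)
	// set frame buffers to be placed sideways (VRAM origin 0,0 is an assumption)
	SetDisplay(0, 0, SCREEN_W, SCREEN_H, RESMODE_SIDEWAYS);
	InitGfxAlloc(GFX_ALLOC_SIZE);
	G.Clear_mode = 0;				// force LibGPU to clear frame buffers at each swap
	*(u32*)&G.Clear.r = 0x808000;	// set clear color to blueish green
	G.VSync_rate = 0;				// 60 fps mode
	SetDispMask(1);					// turn the display on
	// main loop at the core of the program
	while (1)
	{
		// set allocators for this frame
		BeginDraw();
		// this is where all your logic goes
		// the actual DMA draw and swap
		EndDraw();
	}
}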

That’s literally all the code you need to replace all LibGS calls that usually take care of setting the environment.
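One detail worth spelling out is the clear color write in the sample, which sets all the channels with a single 32-bit store. The sketch below reproduces it on the host with a stand-in `Color` struct that mirrors CVECTOR’s r, g, b, cd byte order. It assumes a little-endian CPU like the PlayStation’s, and `pack_rgb` and `set_color` are hypothetical helpers, not libgpu calls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* stand-in for libgpu's CVECTOR: four consecutive bytes */
typedef struct { uint8_t r, g, b, cd; } Color;

/* Build the 32-bit value that, stored on a little-endian CPU,
   puts r in the lowest byte, then g, then b. */
static uint32_t pack_rgb(uint8_t r, uint8_t g, uint8_t b)
{
	return (uint32_t)r | ((uint32_t)g << 8) | ((uint32_t)b << 16);
}

/* Write all channels at once; memcpy sidesteps the aliasing issues
   a pointer cast can run into on modern compilers. */
static void set_color(Color *c, uint32_t packed)
{
	assert(sizeof(Color) == sizeof(uint32_t));
	memcpy(c, &packed, sizeof(packed));
}
```

Read this way, 0x808000 means r = 0x00, g = 0x80 and b = 0x80, which is the blueish green from the sample.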
