Rants from Vas
https://rants.vastheman.com
Take a hit with V-Real

Kabutogani! (or the XE‑1AP Analog Joypad)
https://rants.vastheman.com/2022/12/24/kabutogani/
Fri, 23 Dec 2022 21:57:16 +0000

Way back in the 1980s, the days of the exciting home computer and game console boom, Micom Soft introduced their XE‑1 line of game controller peripherals. From the humble XE‑1B two-button joystick for Japanese home computers, to the XE‑1ST2 switchable 4-way/8-way joystick with support for FM Towns and Sega consoles, to the fully programmable XE‑1SFC Super Famicom stick (with an LCD screen), there was something for everyone. They were known for their excellent build quality, unconventional features, and of course their rotating button clusters.

The XE‑1AJ Intelligent Controller is possibly the most impressive, and definitely the most imposing, member of the product line. Also sold as the Sharp CZ-8NJ2 Cyber Stick, it is a large HOTAS flight controller for desktop use. Just the size and weight are impressive: it takes up a lot of space on your desk, the stick and throttle shafts are metal, and it’s clearly built to last. It features an analog stick and throttle, a thumb button and trigger on the stick, a thumb button and two-way rocker switch on the throttle, and five panel buttons. The stick button and trigger functions can be exchanged, auto-fire is supported for the stick button and trigger, and a variety of special modes can be enabled by holding panel buttons while pressing the reset button. If that isn’t enough, the stick and throttle positions can be reversed for left-handed use. It can even be switched to digital mode for games that don’t support analog controls.

With the launch of the Sega Mega Drive in 1988, someone at Micom Soft must have started wondering, “How can we bring all the fun of the XE‑1AJ to couch gaming?” The result was the XE‑1AP joypad. Brought to market in 1989, the XE‑1AP was the first control pad to feature analog controls. It was nicknamed the horseshoe crab (kabutogani in Japanese) due to its shape. It actually succeeded in packing all the key functionality of the XE‑1AJ into a hand-held controller, and even added a few features. In many ways, it is the ancestor of modern game controllers, pioneering many features that were only popularised years later. Its features included:

  • Ergonomic shape with grip handles.
  • Analog stick controlled with left thumb.
  • Analog throttle controlled with right thumb; could be rotated through 270° for horizontal or vertical movement in either direction.
  • Two shoulder buttons on each side.
  • Two face buttons on each side, plus Start and Select buttons.
  • Auto-fire support for the right shoulder buttons.
  • Modes to support Japanese home computers and Sega consoles.

The design of the Sega Saturn 3D Controller was directly influenced by the XE‑1AP – the similarity is quite striking. The Saturn 3D Controller was further developed into the Dreamcast controller, which in turn served as the inspiration for the Xbox controller. The Xbox controller was ridiculed for being excessively large, but the XE‑1AP is actually bigger. This is at least partly for practical reasons: it needs space for its 42-pin plastic DIP Fujitsu MB88513 microcontroller and the rest of the components that make it work. Electronics used to be a lot bigger.

It’s interesting how obscure the XE‑1AP has become. Perhaps poor marketing is partly responsible. It was supported by several Mega Drive, Mega‑CD and 32X games, but only mentioned in the Japanese versions of the manuals. The export versions of the manuals never mentioned the XE‑1AP, despite the export versions of the games supporting it in most cases. Micom Soft sold an adapter that allowed the XE‑1AJ or XE‑1AP to be used with the NEC PC Engine. Although five PC Engine games supported analog controls, this was never heavily promoted. Three of them don’t even give any indication when a supported analog controller is detected. The high price probably didn’t help either, but I can’t help wondering what could have happened if the XE‑1AP had been promoted more heavily, leading to more adoption, and hence more games developed with support for it.

I recently added support for the XE‑1AP to MAME. In typical MAME fashion, the emulated XE‑1AP can be used with compatible games on several systems, including the Sega Mega Drive, NEC PC Engine, Sharp X68000 and FM Towns families. I’ve also written some basic instructions for using the controller with some of the games I tested.

Sharp provided assembly language source code for an X68000 driver for the XE‑1AJ and XE‑1AP. This may be the reason the X68000 has the best software support, with over a dozen games making use of analog controls. The Mega Drive and PC Engine games with XE‑1AP support use a similar algorithm to read data from the controller, but the games developed by CSK Research Institute for FM Towns use a different approach. Although this meant more effort getting the emulated controller to work across all the platforms with compatible software, it does mean we can infer more details about the controller’s behaviour.

You will need to manually assign controls to the XE‑1AP inputs in MAME to get a good experience, due to the number of controls and how different the layout is to common controllers. Here are some tips:

  • The XE‑1AP plugs straight into controller ports on the Mega Drive, X68000, and FM Towns.
  • For the PC Engine, you need to plug it in via the XHE‑3 PC joystick adapter. It’s the default peripheral for the XHE‑3, as it’s the most useful thing you can connect to it.

  • X68000 games expect the XE‑1AP to be connected to the first joystick port. Keep in mind that in MAME, the X68000 mouse counts as a “player”, so the XE‑1AP on the first joystick port will use player 2 controls by default.
  • Most Mega Drive games expect the XE‑1AP to be connected to the first controller port. Night Striker for Mega‑CD is an exception – you need to connect the XE‑1AP to the second controller port, and use a regular control pad connected to the first controller port to select analog mode.
  • FM Towns games require you to connect the XE‑1AP to the second controller port, and use a Towns Pad connected to the first controller port to navigate the menus and select analog controls. CSK Research Institute games require you to change the XE‑1AP’s Interface setting to MD in MAME’s Machine Configuration menu.
  • A lot of games only detect controllers on start, so if you switch the Mode from Analog to Digital or vice versa and controls stop working, you might need to restart the emulated software.

Assuming you have fairly standard settings, here are some example command lines for starting different games with the XE‑1AP:

% mame x68000 -joy1 xe1ap shangon
% mame megadriv -ctrl1 xe1ap smgp2
% mame fmtowns -ramsize 2M -pad1 townspad -pad2 xe1ap aburner3
% mame pce -ctrl pcjoy -cartridge scdsys -cdrom forgottnj

For what it’s worth, here’s my take on some of the games I tested:

After Burner II for X68000, Mega Drive and PC Engine
It’s definitely a lot easier to hit oncoming planes with your Vulcan guns with precise control of your flight attitude. It feels like a lot of the controller’s features go unused. Note that for the X68000 version, you need to press OPT.1 on the keyboard (in MAME, this is assigned to PrtScr by default) at the title screen to enable joystick controls. There’s nothing on the screen to tell you to do this, and no visual indication when joystick controls have been enabled.
Ayrton Senna’s Super Monaco GP 2 for Mega Drive
This one feels like it’s been thought out very well: the shoulder buttons are your paddle shifters (left to shift down, right to shift up), you control the accelerator and brakes with the stick using your left thumb, you rotate the throttle to be horizontal and use it to steer with your right thumb, and you press E1 or E2 with your left thumb to make a pit stop. You’ll need to change your input assignments in MAME to make this game playable. With an Xbox-style controller, you can assign the X axis of the right stick to the throttle to get close to the original setup. It feels much better with analog steering – staying on the track is a lot easier.
Thunder Blade for X68000
With analog controls, you get the best possible experience in this game. You can hover and land, just like you could in the arcade version. Assigning the throttle to the left trigger on an Xbox controller lets you fly with your left thumb and index finger, leaving your right thumb and index finger free for controlling weapons. If you assign the throttle to “Joy 1 LT -” (squeeze and release the trigger three times when assigning the input), releasing the trigger hovers and squeezing it flies forward. Assign the right trigger to B or B’ to fire the cannon, and assign a face button to A or A’ to fire missiles.
Super Hang-On for X68000
Nothing surprising, just responsive analog controls. This game lends itself well to assigning the left and right triggers on an Xbox controller to the brakes and throttle – they even correspond to the positions of the brake lever and throttle twist grip on a motorbike. The correct assignment is “Joy 1 LT Joy 1 RT Reverse” (squeeze and release the left trigger once, then squeeze and release the right trigger four times when assigning the input). Remember not to assign the triggers to other buttons at the same time, or you won’t be able to select stages and music with the brake.
Taito Chase H.Q. for FM Towns
For me, analog steering took this game from being almost unplayable to great fun. If you prefer to use analog triggers or pedals for the accelerator and brakes, see the notes about Super Hang-On for how to assign the throttle. The turbo boost is button D, which makes perfect sense as the throttle thumb button on an XE‑1AJ, but doesn’t make quite as much sense as the lower left shoulder button on an XE‑1AP. If you’re using a thumb stick for the accelerator and brakes, it’s probably most natural to assign it to clicking the stick down, or maybe the (upper) shoulder button on the same hand as the stick you’re using. If you’re using triggers for the accelerator and brakes, using a face button for turbo boost might be a good choice. Remember to change the Interface back to Personal Computer in MAME’s Machine Configuration menu if you previously changed it to MD to play a CRI game.
After Burner III for FM Towns
This is a CRI game, so you need to change the Interface to MD in MAME’s Machine Configuration menu, but it’s worth it, because this is a highlight! It doesn’t even pretend to be accurate flight simulation, but if you wanted accurate simulation, you wouldn’t be playing an After Burner game. You can roll all over the place, and it’s an awesome feeling to hold C or D, slam the throttle, and watch the After Burner Level gauge fill while you shake off a bogie on your six. It’s also got more variety than After Burner II, with stuff like ground attack levels where you plink tanks.
Operation Wolf for PC Engine
This game actually does tell you on the title screen when it’s recognised an analog controller. It gives you absolute position aiming using the stick. Definitely a lot faster and more accurate than moving the reticle around with a D-pad.
Forgotten Worlds for PC Engine Super CD-ROM²
Remember to insert the Super CD-ROM² System HuCard as well as the Forgotten Worlds CD. You’ll need to assign a key to the Run button on the joystick adapter, because the Super CD-ROM² System software won’t recognise the Start button on the XE‑1AP, and you’ll need to press it to start the software on the CD. This is the other PC Engine game that lets you know it’s detected an analog controller on the title screen. This one was a really pleasant surprise for me. The analog movement and aiming controls felt beautiful. It probably helps that it has pretty graphics and a lovely CD soundtrack, too. Load times are noticeable, but not long enough to be irritating.
Out Run for PC Engine
As with other racing games, it’s easier to stay on the track with analog steering. Using the analog throttle for the accelerator while using buttons for the brakes was an interesting design choice – you can press any of the A/B/C/D/E buttons to brake. If you decide to reassign the throttle to an analog trigger or pedal on an Xinput controller, you’ll need to assign it to the negative half of the axis. For the right trigger on an Xbox controller, squeeze and release the trigger until it shows “Joy 1 RT -” (three times) when assigning the input, and make sure you don’t have the same trigger assigned to any of the buttons.
Thunder Blade for PC Engine
This one’s a disappointment – it’s no different to using a digital game pad. Move the stick a little and you don’t move; move it a bit further, hit the threshold, and you’re moving at full speed. Also, the weapon controls are reversed compared to the X68000 version, with A firing the cannon and B firing missiles. I guess it’s nice of them to make it recognise the controller in analog mode, but it would have been even nicer if they’d made better use of it.
Musha Aleste for Mega Drive
This game is infamous for being unplayable with the XE‑1AP in analog mode. The stick controls your absolute position on the screen, which really doesn’t work for a shooter like this – it’s far too twitchy. Surprisingly, the game actually becomes playable if you assign mouse axes to control the stick, feeling similar to PC shooters like Raptor: Call of the Shadows.

XE‑1AP support isn’t the only controller-related improvement coming in MAME 0.251 – other things that made it in this month include:

  • Pluggable controllers for the Sega Mega Drive and SG‑1000 families.
  • Pluggable controllers for the NEC PC‑6001, PC‑8801 and PC‑88VA families.
  • Pluggable controllers for the Sharp MZ‑800, MZ-2500 and X68000 families.
  • Sega Mouse (2-button) and Mega Mouse (4-button) support for the Mega Drive family.
  • Sega Tap/4-Player Adaptor/Team Player support for the Mega Drive family.
  • Support for an ATmega-based paddle controller that works with export versions of the Sega Master System.
  • Mouse support for the PC Engine family.
  • Support for the Konami Hyper Shot controller (although it’s somewhat pointless in emulation).

Con cua or con ghẹ?
https://rants.vastheman.com/2021/12/25/cuaghe/
Fri, 24 Dec 2021 20:31:07 +0000

There are two Việt words for crabs that you might hear frequently, cua and ghẹ, but there seems to be some confusion over the difference. Google Translate unhelpfully renders both of them as “crab” when translating to English. Well, cua refers to crabs in general, and not just true crabs (brachyura), but also the other crab-like crustaceans like hermit crabs (anomura) – if it’s a crab, it’s con cua.

Ghẹ refers to swimming crabs (portunidae) – crabs that have hind legs that are flattened to form swimming paddles. Well-known ghẹ include the ghẹ xanh (portunus pelagicus, the blue swimmer), the ghẹ dĩa or ghẹ đỏ (portunus haanii, the red warty swimming crab), and the ghẹ chấm (portunus trituberculatus). Note that not all crabs with swimming paddles are called ghẹ – a crab must live in the ocean or estuaries and have the characteristic angular carapace and long, narrow claws (chelae) to be called con ghẹ. The mud crabs scylla serrata (cua bùn) and scylla paramamosain (cua xanh) are called cua because they live in fresh water, their claws are large, and their carapaces are rounder.

There are other Việt words for specific kinds of crabs, including rạm or đam (varunidae), dã tràng (sand bubbler crabs), cáy (ocypodidae), cà ra (Chinese mitten crabs), and cúm núm (calappidae, from cua khúm núm, due to the way they appear to hide their faces – Google Translate amusingly renders “cúm núm” as “nipple flu” in English). Like ghẹ, these aren’t strict taxonomic or phylogenetic classifications – they’re common names based on appearance (morphology) and habitat.

Ghệ or ghẹ can also be used informally to refer to women, as in “ghẹ mới của tao” (my new girlfriend), but this is a little vulgar and should be avoided in polite company.

MSVC C++ ABI Member Function Pointers
https://rants.vastheman.com/2021/09/21/msvc/
Tue, 21 Sep 2021 11:40:49 +0000

This is a detailed discussion of MSVC C++ ABI pointers to member functions, including some of the trade-offs they make. The format of pointers to member functions in C++ is implementation-defined, and a lot of incomplete and misleading information about this topic can be found around the web. There’s a popular article at Code Project that provides a reasonable overview of several implementations of pointers to member functions, but not all the details are accurate. Raymond Chen has a very incomplete and somewhat misleading blog post. As far as I know, there is no publicly available formal specification for the MSVC C++ ABI.

In this article, MSVC C++ ABI with default options is assumed unless stated otherwise. Some of the values assumed to be int may actually be long int or int32_t, but without access to an LP32 or LP64 target using the MSVC C++ ABI it’s impossible to confirm one way or the other. This description is mainly based on the behaviour of MSVC 19.29 for x86-64 and AArch64. The usual disclaimers about implementation-defined behaviour apply: depending on this behaviour produces code that is not portable, details may change at any time, and the accuracy is limited by my understanding.

Casting pointers to member functions

According to the C++ standard, pointers to base class member functions may be cast to pointers to derived class member functions with the same signature for non-virtual bases only. Casting member function pointers across virtual inheritance relationships is forbidden. This rule simplifies implementation of member function pointers by avoiding the need to obtain this pointer offsets to virtual bases from the virtual table at the time a pointer to a member function is called. (You can still call a pointer to a member function of a virtual base class by casting the object to a reference to an instance of the virtual base class. The virtual base class offset is obtained from the virtual table as part of the cast to the base class, not as part of the pointer to member function invocation.)

By way of example, assume the following declarations:

class a { };

class b : public a { };

class c : public virtual a { };

using afunc = void (a::*)(int);
using bfunc = void (b::*)(int);
using cfunc = void (c::*)(int);

In standard C++ it is legal to cast a value of type afunc to type bfunc because the class a is a non-virtual base of the class b. On the other hand, it is not legal to cast a value of type afunc to cfunc because the class a is a virtual base of the class c.
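As a minimal sketch of the legal case, the following repeats the declarations of a, b, afunc and bfunc and converts a pointer to a base class member function to the derived class pointer type. The member function set, the data member stored, and the helpers to_bfunc and call_through_derived are assumptions added purely for illustration; they are not part of the declarations above.

```cpp
#include <cassert>

class a {
public:
    // Hypothetical member function, added only so there is something to point at.
    void set(int value) { stored = value; }
    int stored = 0;
};

class b : public a { };

using afunc = void (a::*)(int);
using bfunc = void (b::*)(int);

// Legal in standard C++ because a is a non-virtual base class of b.
inline bfunc to_bfunc(afunc p) {
    return static_cast<bfunc>(p);
}

// Convert the pointer, invoke it on a derived class instance,
// and return the value the member function stored.
inline int call_through_derived(b &obj, afunc p, int value) {
    bfunc q = to_bfunc(p);
    (obj.*q)(value);
    return obj.stored;
}
```

Calling call_through_derived(obj, &a::set, 42) on an instance of b stores 42 in the a subobject. Trying the same conversion with a class that inherits virtually from a is rejected by a standard-conforming compiler.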

As an extension to the C++ standard, MSVC does allow casting a pointer to a base class member function to a pointer to a derived class member function for virtual base classes. With the declarations from the example above, MSVC allows casting a value of type afunc to cfunc. This is the cause of some of the complications in the implementation of pointers to member functions in the MSVC C++ ABI.

Pointer to member function representations

As an optimisation, there are four different pointer to member representations used in different situations. I call them “single inheritance”, “multiple inheritance”, “virtual inheritance” and “unknown inheritance”. There are options to change the way the compiler selects a member pointer representation, for example MSVC has /vmb, /vmg, /vmm, /vms and /vmv command-line options and a #pragma pointers_to_members directive. Unless otherwise noted, the rules described here assume /vmb or #pragma pointers_to_members(best_case) is in effect.

Single inheritance

Pointers to member functions of classes with single inheritance are equivalent to this structure:

struct {
    uintptr_t   ptr;    // function pointer
};

This representation is the same size as a non-member function pointer. This makes it efficient to store, copy or pass as a function parameter, as it can usually fit in a single address register or general-purpose register.

This representation is used when either:

  • The class definition is available, neither the class nor any of its direct or indirect base classes has any virtual base classes, neither the class nor any of its direct or indirect base classes has more than one base class, and neither the class nor any of its direct or indirect base classes declares any virtual member functions while deriving from a base class that has no virtual member functions.
  • The class definition is not available and a forward declaration of the class with the __single_inheritance qualifier is available.

MSVC will use this representation for all pointers to member functions with the /vmg and /vms options or the #pragma pointers_to_members(full_generality, single_inheritance) directive in effect. In this situation, declaring a pointer to a member of a class with multiple direct base classes, a class with virtual base classes or a class with virtual member functions that derives from a class with no virtual member functions results in “error C2287: ‘c’: inheritance representation: ‘single_inheritance’ is less general than the required ‘multiple_inheritance’” where “c” is the name of the class.

This minimal representation can be used because two assumptions can be made:

  • With non-virtual single inheritance, the base class (if any) always appears at the start of the class. A pointer to an instance of the class will not require adjustment when cast to or from a base class. Therefore, a pointer to a base class member function will not require this pointer adjustment when called.
  • For virtual member functions, the compiler will generate an out-of-line stub that fetches the appropriate virtual table entry and jumps to it.

It is possible to invoke a member pointer using this representation without access to the class definition. Performance for calling this representation is similar to calling a non-member function pointer for non-virtual member functions. For virtual member functions, there is an additional fetch and indirect branch. However, there are no conditional branches involved, which avoids performance penalties on deeply pipelined and/or highly parallel processors.

Multiple inheritance

Pointers to member functions of classes with multiple inheritance are equivalent to this structure:

struct {
    uintptr_t   ptr;    // function pointer
    int         adj;    // this pointer displacement in bytes
};

Note that on typical architectures, pointers and pointer-sized integers have natural alignment and int is no larger than a pointer, so the overall size is twice the size of a pointer. On typical LLP64 targets (including Windows on x86-64 and AArch64), the structure has four padding bytes for a total size of sixteen bytes.

This representation is used when either:

  • The class definition is available, neither the class nor any of its direct or indirect base classes has any virtual base classes, and the class or one of its direct or indirect base classes has at least two base classes.
  • The class definition is available, neither the class nor any of its direct or indirect base classes has any virtual base classes, and the class or one of its direct or indirect base classes declares at least one virtual member function while deriving from a base class that has no virtual member functions.
  • The class definition is not available and a forward declaration of the class with the __multiple_inheritance qualifier is available.

MSVC will use this representation for all pointers to member functions with the /vmg and /vmm options or the #pragma pointers_to_members(full_generality, multiple_inheritance) directive in effect. In this situation, declaring a pointer to a member of a class with at least one direct or indirect virtual base class results in “error C2287: ‘c’: inheritance representation: ‘multiple_inheritance’ is less general than the required ‘virtual_inheritance’” where “c” is the name of the class.

The this pointer displacement is necessary for the purpose of casting a pointer to a non-virtual base class member function to a pointer to a derived class member function with the same signature. The offset of the base class within the derived class is calculated when the pointer is cast, and applied (added to the this pointer) when it is invoked.

It is possible to invoke a member pointer using this representation without access to the class definition. This representation has twice the space cost of the single inheritance representation, but minimal additional performance cost to invoke – just one additional integer fetch and addition.
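To see why the displacement is needed, here is a small sketch in portable C++ that does not depend on any particular ABI. The type names first, second and derived, the member function where, and both helper functions are hypothetical names chosen for this example. It shows that the second base class subobject sits at a non-zero offset on typical implementations, and that a member function pointer cast from the base class type still delivers the correct this pointer when called on the derived object.

```cpp
#include <cassert>
#include <cstddef>

struct first { int x = 0; };

struct second {
    int y = 0;
    // Hypothetical member function that reports the this pointer it received.
    const void *where() const { return this; }
};

struct derived : first, second { };

using second_func  = const void *(second::*)() const;
using derived_func = const void *(derived::*)() const;

// Byte offset of the second base class subobject within a derived instance.
inline std::ptrdiff_t second_offset(const derived &d) {
    const second *base = &d;  // derived-to-base conversion adjusts the address
    return reinterpret_cast<const char *>(base) -
           reinterpret_cast<const char *>(&d);
}

// Cast the base class member function pointer to the derived class type,
// then call it on the derived object. The implementation has to apply the
// recorded displacement so that where() sees the second subobject.
inline const void *call_as_derived(const derived &d, second_func p) {
    derived_func q = static_cast<derived_func>(p);
    return (d.*q)();
}
```

With mainstream compilers, second_offset returns sizeof(first), and call_as_derived(d, &second::where) returns the address of the second subobject rather than the address of d itself.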

Virtual inheritance

Pointers to member functions of classes with virtual inheritance are equivalent to this structure:

struct {
    uintptr_t   ptr;    // function pointer
    int         adj;    // this pointer displacement in bytes
    int         vindex; // byte offset to base class offset in virtual table
};

Note that in the LLP64 data model, the two int members fit into the size of a pointer, so this representation has the same size as the multiple inheritance representation on typical LLP64 targets (including Windows on x86-64 and AArch64).

This representation is used when either:

  • The class definition is available, and either the class or at least one of its direct or indirect base classes has at least one virtual base class.
  • The class definition is not available and a forward declaration of the class with the __virtual_inheritance qualifier is available.

There is no combination of options or directives that will cause MSVC to use this representation for all pointers to member functions.

The virtual table index is necessary for the purpose of casting a pointer to a member function of a virtual base class to a pointer to a derived class member function with the same signature. The virtual table for the derived class contains offsets to all virtual base classes. The location of the offset to the virtual base class in the virtual table is populated when the pointer is cast; the offset is fetched from the instance’s virtual table and applied when the pointer is invoked, in addition to applying the this pointer displacement stored in the member function pointer directly.

It is not possible to invoke this representation of a pointer to a member function without access to the class definition – attempting to do so results in “error C2027: use of undefined type ‘c’” where “c” is the name of the class that was forward declared with the __virtual_inheritance qualifier. This requirement comes from a combination of two factors:

  • Structure layout rules mean that the virtual table pointer is not necessarily at the location the this pointer points to. (The virtual table pointer may not be at the location the this pointer points to in some situations where the first base class has no virtual member functions or virtual bases, but a virtual table pointer is inherited from another base class. It’s very rare to actually encounter such a case in practice.)
  • Invoking this representation requires access to the virtual table pointer, and hence knowledge of the offset to the virtual table pointer from the location the this pointer points to. This requires the base classes to be known.

The offset to the virtual base class and the this pointer displacement must be interpreted relative to the location of the virtual table pointer, which is not necessarily the location the this pointer points to. In pseudocode, the sequence for invoking this representation looks like this:

vptr = this[vadj]
this += vadj + vptr[vindex] + adj
CALL ptr

The offset to the virtual base class found in the virtual table will always be zero if the pointer does not represent a pointer to a member function of a virtual base class. In standard-conforming code, this will always be the case, as casting across virtual inheritance relationships is not permitted.

Compared to the multiple inheritance representation, this representation requires two additional fetches (the virtual table pointer and offset to the base class), at least two additional integer additions (the offset into the virtual table and the offset to the base class), and possibly a third addition of a constant (the offset to the virtual table pointer from the location the this pointer points to). On most architectures, some of these additions are implicit in addressing modes for the fetches. This representation still avoids the need for conditional branches: because the class is known to have a virtual table and the location of the virtual table pointer within the object is known, the offset to the virtual base can be fetched and added unconditionally even if it will be zero in most cases.
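The this pointer adjustment from the pseudocode above can be modelled at the byte level in ordinary C++. This is only a sketch of the arithmetic, not compiler output: the names virtual_mfp and adjust_this are made up for the example, the object is treated as a raw byte buffer, and the table of base class offsets is assumed to be an array of int whose entry at offset zero holds zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Model of the virtual inheritance representation.
struct virtual_mfp {
    std::uintptr_t ptr;     // function pointer (not used by this model)
    int            adj;     // this pointer displacement in bytes
    int            vindex;  // byte offset to base class offset in virtual table
};

// Compute the adjusted this pointer without any conditional branch.
// vadj is the offset from this to the virtual table pointer; a real compiler
// takes it from the class definition, so here it is passed in by the caller.
inline const unsigned char *adjust_this(const unsigned char *self, int vadj,
                                        const virtual_mfp &p) {
    const unsigned char *vptr;
    std::memcpy(&vptr, self + vadj, sizeof vptr);                    // vptr = this[vadj]
    int base_offset;
    std::memcpy(&base_offset, vptr + p.vindex, sizeof base_offset);  // vptr[vindex]
    return self + vadj + base_offset + p.adj;
}
```

When vindex selects the zero entry, the result is simply this + vadj + adj, which is how the same unconditional sequence serves pointers that do not cross a virtual inheritance relationship.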

Unknown inheritance

Pointers to member functions of classes with unknown inheritance are equivalent to this structure:

struct {
    uintptr_t   ptr;    // function pointer
    int         adj;    // this pointer displacement in bytes
    int         vadj;   // offset to vptr or undefined
    int         vindex; // byte offset to base class offset in vtable or zero
};

Note that on typical LLP64 targets (including Windows on x86-64 and AArch64), the structure has four padding bytes for a total size of twenty-four bytes.
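The sizes quoted for the four representations can be checked with a small layout model. This sketch assumes a 64-bit LLP64 target with 8-byte pointers and 4-byte int, and uses fixed-width types with explicit alignment so the result does not depend on the compiler building it. The struct names are hypothetical; real member function pointers are not declared this way.

```cpp
#include <cstddef>
#include <cstdint>

// Models of the four representations for a 64-bit LLP64 target.
struct single_rep {
    alignas(8) std::uint64_t ptr;   // function pointer
};

struct multiple_rep {
    alignas(8) std::uint64_t ptr;   // function pointer
    std::int32_t adj;               // this pointer displacement
};

struct virtual_rep {
    alignas(8) std::uint64_t ptr;   // function pointer
    std::int32_t adj;               // this pointer displacement
    std::int32_t vindex;            // offset into virtual table
};

struct unknown_rep {
    alignas(8) std::uint64_t ptr;   // function pointer
    std::int32_t adj;               // this pointer displacement
    std::int32_t vadj;              // offset to virtual table pointer
    std::int32_t vindex;            // offset into virtual table
};

static_assert(sizeof(single_rep) == 8, "same size as a pointer");
static_assert(sizeof(multiple_rep) == 16, "8 + 4 bytes, plus 4 padding bytes");
static_assert(sizeof(virtual_rep) == 16, "8 + 4 + 4 bytes, no padding");
static_assert(sizeof(unknown_rep) == 24, "8 + 12 bytes, plus 4 padding bytes");
static_assert(offsetof(unknown_rep, vindex) == 16, "vindex starts the third 8-byte unit");
```

The padding appears because each structure is rounded up to a multiple of the 8-byte alignment of its first member.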

This representation is used when the class definition is not available, and the forward declaration of the class has no __single_inheritance, __multiple_inheritance or __virtual_inheritance qualifier.

MSVC will use this representation for all pointers to member functions with the /vmg and /vmv options, the /vmg option without the /vms or /vmm options, or the #pragma pointers_to_members(full_generality, virtual_inheritance) directive in effect.

If the virtual table index is non-zero, the offset to the virtual table pointer is added to the this pointer, and the offset to the base class is obtained from the virtual table and added to the this pointer. After this, the this pointer displacement is added to the (possibly already adjusted) this pointer. In pseudocode, the sequence for invoking this representation looks like this:

IF 0 != vindex:
    vptr = this[vadj]
    this += vadj + vptr[vindex]
ENDIF
this += adj
CALL ptr

It is possible to invoke this representation of a pointer to a member function without access to the class definition. Invoking this representation of a pointer to a member function requires a conditional branch and the associated performance penalties on deeply pipelined and/or highly parallel processors. This is necessary because without the class definition, it is not possible to know whether the class has a virtual table at all, and hence it may not be possible to provide an offset to a zero value in the virtual table when an offset to a virtual base class is not required.
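The conditional sequence in the pseudocode above can be modelled in plain C++ to make the role of the branch concrete. As before, this is a byte-level sketch rather than compiler output: unknown_mfp and adjust_this are hypothetical names, the object is a raw byte buffer, and the table of base class offsets is assumed to be an array of int.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Model of the unknown inheritance representation.
struct unknown_mfp {
    std::uintptr_t ptr;     // function pointer (not used by this model)
    int            adj;     // this pointer displacement in bytes
    int            vadj;    // offset to vptr or undefined
    int            vindex;  // byte offset to base class offset in vtable or zero
};

// Compute the adjusted this pointer. The object is only inspected for a
// virtual table pointer when vindex is non-zero, so the same code is safe
// for classes that have no virtual table at all.
inline const unsigned char *adjust_this(const unsigned char *self,
                                        const unknown_mfp &p) {
    if (p.vindex != 0) {
        const unsigned char *vptr;
        std::memcpy(&vptr, self + p.vadj, sizeof vptr);                  // vptr = this[vadj]
        int base_offset;
        std::memcpy(&base_offset, vptr + p.vindex, sizeof base_offset);  // vptr[vindex]
        self += p.vadj + base_offset;
    }
    return self + p.adj;
}
```

With vindex set to zero, only adj is applied and the vadj member is never read, which matches its description as possibly undefined.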

Comparison to Itanium C++ ABI

The Itanium C++ ABI is currently one of the most popular C++ ABIs, despite the market failure of the Itanium CPU architecture. The Itanium C++ ABI has been widely adopted on UNIX-like systems and by Open Source/Free Software development tools. Exact details vary by architecture, but conceptually the Itanium C++ ABI always represents pointers to member functions as a tuple containing three values:

  • A union containing a function pointer or virtual table index
  • A displacement to apply to the this pointer
  • A flag to discriminate between a function pointer or a virtual table index
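The conceptual tuple can be sketched as a structure with an explicit flag. This is a simplified model under stated assumptions, not the actual encoding: real implementations pack the flag into a spare bit rather than storing it separately, and the virtual table is modelled here as a plain array of function pointers indexed by slot number. All names (model_object, conceptual_mfp, invoke and the sample functions) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

struct model_object;
using target_fn = int (*)(model_object *self);

// Stand-in for an object whose first member is its virtual table pointer.
struct model_object {
    const target_fn *vtable;
    int value;
};

// The three conceptual values: function pointer or index, displacement, flag.
struct conceptual_mfp {
    union {
        target_fn   function;  // valid when is_virtual is false
        std::size_t vindex;    // valid when is_virtual is true (slot number here)
    };
    std::ptrdiff_t adj;        // displacement to apply to the this pointer
    bool is_virtual;           // discriminates between the union members
};

inline conceptual_mfp make_direct(target_fn f, std::ptrdiff_t adj) {
    conceptual_mfp p{};
    p.function = f;
    p.adj = adj;
    p.is_virtual = false;
    return p;
}

inline conceptual_mfp make_virtual(std::size_t slot, std::ptrdiff_t adj) {
    conceptual_mfp p{};
    p.vindex = slot;
    p.adj = adj;
    p.is_virtual = true;
    return p;
}

// Every invocation tests the flag, whichever kind of pointer it turns out to be.
inline int invoke(model_object *obj, const conceptual_mfp &p) {
    auto *self = reinterpret_cast<model_object *>(
        reinterpret_cast<char *>(obj) + p.adj);
    target_fn fn = p.is_virtual ? self->vtable[p.vindex] : p.function;
    return fn(self);
}

// Sample targets for trying the model out.
inline int get_value(model_object *self) { return self->value; }
inline int get_twice(model_object *self) { return 2 * self->value; }
```

Given an object whose table holds get_value and get_twice, invoke with make_direct(get_value, 0) calls the function directly, while make_virtual(1, 0) looks up the second table entry first.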

Disadvantages compared to the MSVC C++ ABI include:

  • No provision for obtaining an offset to a virtual base class, so casting pointers to members of virtual base classes to pointers to members of derived classes cannot be supported
  • Pointers to member functions are always larger than non-member function pointers, even in the simplest cases
  • A conditional branch is required to invoke any kind of member function pointer in order to handle either a function pointer or a virtual table index

Advantages over the MSVC C++ ABI include:

  • All member function pointer types are the same size and can be invoked in the same way
  • Layout rules mean the virtual table pointer will always be at the location the this pointer points to if present, so there is no need to account for the offset to the virtual table pointer
  • If a pointer to a virtual member function is to be called repeatedly, it is simple to resolve the function address and avoid repeated virtual table fetches and additional indirect branches
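
As a rough illustration of the conceptual three-value tuple, the sketch below models it with an explicit flag and invokes it against a hand-built table of function pointers. Everything here (ConceptualMfp, FakeObject, invoke and the example functions) is hypothetical and only simulates the idea. Real Itanium ABI implementations pack the flag into one of the other two values in an architecture-specific way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using Method = int (*)(void* self);

// Stand-in for an object whose virtual table pointer is at offset zero.
struct FakeObject {
    const Method* vptr;
    int value;
};

// Conceptual tuple: function pointer or table index, displacement, flag.
struct ConceptualMfp {
    union {
        Method function;           // valid when is_virtual is false
        std::size_t vtable_index;  // valid when is_virtual is true
    };
    std::ptrdiff_t adj;            // displacement to apply to this
    bool is_virtual;               // discriminates the union
};

inline int invoke(void* object, const ConceptualMfp& mfp) {
    assert(object != nullptr);
    char* self = static_cast<char*>(object) + mfp.adj;
    Method target;
    if (mfp.is_virtual) {
        // The table pointer sits where the adjusted this pointer points.
        const Method* vptr;
        std::memcpy(&vptr, self, sizeof vptr);
        target = vptr[mfp.vtable_index];
    } else {
        target = mfp.function;
    }
    return target(self);
}

// Example "member functions" taking the adjusted this pointer explicitly.
inline int get_value(void* self) { return static_cast<FakeObject*>(self)->value; }
inline int get_double(void* self) { return 2 * static_cast<FakeObject*>(self)->value; }

const Method kExampleVtable[] = {&get_value, &get_double};
```

Note that the branch on is_virtual is taken for every call, which is the third disadvantage listed above. The lookup never needs a separate offset to the table pointer, which is the second advantage.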

Calling conventions

The proliferation of incompatible calling conventions for 32-bit i386 or i686 targets is well-known. For member functions, explicit arguments are pushed onto the stack in right-to-left order, the this pointer is passed in register ECX, and the called function removes the arguments from the stack on return. However, it is widely assumed that on x86-64, member functions are equivalent to non-member functions with the this pointer as an implicit first parameter. This is not true. The MSVC C++ ABI for Windows uses a subtly different calling convention for member functions on both x86-64 and AArch64.

This is not a comprehensive discussion of Windows calling conventions on x86-64 and AArch64. It’s intended to be just detailed enough to highlight the differences between non-member functions and member functions.

Non-member functions

Both x86-64 and AArch64 pass parameters and return results in registers. However, only scalar types (integers, floating point types, pointers and enumerated types), references, and small aggregate structures and unions (trivially constructible, destructible, copyable and assignable) may be returned in registers. In cases where the return type may not be returned in a register, the caller allocates space for the return value (typically on the stack) and passes a pointer to the area for the return value as an implicit parameter.

On x86-64, register RCX is usually used for the first integer or pointer argument. However, if the return type cannot be returned in a register, the pointer to the area for the return value is passed in register RCX and explicit parameters are shifted by one position:

Return value in register:
    RCX = first integer/pointer argument
    RDX = second integer/pointer argument
    R8 = third integer/pointer argument

Pointer to return value area:
    RCX = pointer to return value area
    RDX = first integer/pointer argument
    R8 = second integer/pointer argument

On AArch64, registers X0 to X7 are used for integer or pointer arguments. If the return type cannot be returned in a register, the pointer to the area for the return value is passed in register X8, which would otherwise be a volatile register with no special significance. Explicit parameters do not need to be shifted:

Return value in register:
    X0 = first integer/pointer argument
    X1 = second integer/pointer argument
    X2 = third integer/pointer argument

Pointer to return value area:
    X0 = first integer/pointer argument
    X1 = second integer/pointer argument
    X2 = third integer/pointer argument
    X8 = pointer to return value area

Member functions

There are three key differences in the calling convention for member functions:

  • The this pointer is passed as an implicit first parameter.
  • Structure and union type values are never returned in registers.
  • The pointer to the return value area for structure and union types is passed as an implicit second parameter after the this pointer.

Note that for scalar types that may not be returned in registers, the pointer to the result area is passed in the same way it would be for a non-member function. An example of a type returned this way is a pointer to a member function of a class with unknown inheritance: it is a pointer, and hence a scalar type, but with a size of twenty-four bytes it is too large to return in registers.

For x86-64, these are the three possible situations on entry to a member function – note that when a scalar value cannot be returned in a register, the this pointer is shifted by one position:

Return value in register:
    RCX = this pointer
    RDX = first integer/pointer argument
    R8 = second integer/pointer argument

Pointer to return value area (scalar):
    RCX = pointer to return value area
    RDX = this pointer
    R8 = first integer/pointer argument

Pointer to return value area (structure/union):
    RCX = this pointer
    RDX = pointer to return value area
    R8 = first integer/pointer argument

For AArch64, these are the three possible situations on entry to a member function – note that the this pointer is always in X0:

Return value in register:
    X0 = this pointer
    X1 = first integer/pointer argument
    X2 = second integer/pointer argument

Pointer to return value area (scalar):
    X0 = this pointer
    X1 = first integer/pointer argument
    X2 = second integer/pointer argument
    X8 = pointer to return value area

Pointer to return value area (structure/union):
    X0 = this pointer
    X1 = pointer to return value area
    X2 = first integer/pointer argument
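
For a concrete picture, here is a small C++ example with the register assignments written as comments. The types and function names are made up for illustration. The comments simply restate the rules above for out-of-line calls under the MSVC C++ ABI, so they describe what I would expect rather than anything the code itself verifies, and other compilers or inlined calls will differ.

```cpp
#include <cassert>

struct Point { int x; int y; };

// Non-member function, small trivial aggregate result: returned in a register.
//   x86-64:  RCX = x, RDX = y
//   AArch64: X0 = x, X1 = y
Point make_point(int x, int y) { return Point{x, y}; }

struct Widget {
    int scale;

    // Member function, scalar result: returned in a register.
    //   x86-64:  RCX = this, RDX = x, R8 = y
    //   AArch64: X0 = this, X1 = x, X2 = y
    int scaled_sum(int x, int y) { return scale * (x + y); }

    // Member function, structure result: never returned in a register.
    //   x86-64:  RCX = this, RDX = pointer to return value area, R8 = x, R9 = y
    //   AArch64: X0 = this, X1 = pointer to return value area, X2 = x, X3 = y
    Point scaled_point(int x, int y) { return Point{scale * x, scale * y}; }
};
```

The interesting pair is make_point and Widget::scaled_point: the same return type is handled differently purely because one of them is a member function.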

Why the difference?

The different calling convention for member functions on x86-64 has been in place since MSVC added support for the architecture. AArch64 seems to follow x86-64 by analogy.

Initially I thought the different calling convention for member functions was to ensure the this pointer would always be in the same register for convenience. That was before I realised there are situations where this is not the case, and there are different rules for which types may be returned in registers.

I can only speculate as to what the reasoning behind the decision to use a different calling convention was. It may simplify interoperability with some other language, or it may simplify COM implementation in some way.

The problems for delegates

Unless you’re writing assembly language code or a compiler that generates it (or debugging a low-level issue), the real motivation for getting into the gory details of member function pointer implementations almost always comes back to the desire to implement fast delegates. Invoking pointers to member functions can be slower than invoking pointers to non-member functions, and mitigating that is a common goal.

The MSVC ABI presents three major problems for the purpose of implementing fast delegates without limiting developers:

  • It is not practical to distinguish between the multiple inheritance and virtual inheritance representations of pointers to member functions in a template on LLP64 platforms. It’s simple to distinguish between the single inheritance, multiple inheritance and unknown inheritance representations using the result of the sizeof operator. However, the multiple inheritance and virtual inheritance representations have the same size on LLP64 platforms due to alignment and padding requirements. There’s no standard type trait for determining whether a class has at least one direct or indirect virtual base, and as far as I know there’s no MSVC extension for doing so either.
  • There’s no standard way to obtain the location of the virtual table pointer within an object. In certain situations, the virtual table pointer will not be at the location the this pointer points to. There’s no standard way to obtain the offset to the virtual table pointer, and as far as I know there’s no MSVC extension for obtaining it, either. This makes it impossible to safely support the virtual inheritance representation even on platforms where it can be detected reliably.
  • The subtle difference in calling conventions for non-member and member functions means that it is not possible to convert a pointer to a member function to an equivalent pointer to a non-member function if it returns a structure or union value that is not trivially default constructible and destructible. There is no equivalent non-member function signature that will cause the result value to be constructed in the correct location. For trivially constructible and destructible types, the area for the return value can be treated as a reference parameter following the this pointer. Using this approach requires a temporary variable that the compiler might not elide, and if your delegate implementation supports non-member functions as well as member functions, a conditional branch is required to select the correct equivalent non-member function signature before the call. This causes a performance penalty for all calls, working against the original goals of designing a fast delegate type.
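
The sizeof approach from the first point can be sketched as follows. The enumeration and function names are hypothetical, and the sizes are assumptions for 64-bit Windows with the MSVC C++ ABI: 8 bytes for single inheritance, 16 for both multiple and virtual inheritance, and 24 for unknown inheritance. Only the last figure and the fact that the middle two coincide are stated in this article.

```cpp
#include <cassert>
#include <cstddef>

enum class MfpKind { Single, MultipleOrVirtual, Unknown, Unrecognised };

// Classify a representation from its size alone (assumed 64-bit MSVC sizes).
constexpr MfpKind classify_by_size(std::size_t size) {
    return size == 8  ? MfpKind::Single
         : size == 16 ? MfpKind::MultipleOrVirtual  // cannot be told apart
         : size == 24 ? MfpKind::Unknown
                      : MfpKind::Unrecognised;
}

// Convenience wrapper for use in a template taking a member function pointer type.
template <typename MemFnPtr>
constexpr MfpKind classify() {
    return classify_by_size(sizeof(MemFnPtr));
}
```

The MultipleOrVirtual case is exactly the ambiguity described above: nothing in the size tells you whether a virtual table lookup is needed.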

In practice, many developers just naïvely assume the virtual table pointer can be found at the location the this pointer points to, even though this isn’t guaranteed by the layout rules. It’s possible to work around the differing calling conventions by instantiating an adapter function when binding a delegate to a member function that returns a structure or union by value, but there are real-world delegate implementations that don’t do this. There are also real-world delegate implementations that ignore the differences between the multiple inheritance and virtual inheritance representations of member function pointers.
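
The adapter function workaround might look something like this minimal sketch. The Delegate class, its bind function and the example types are all hypothetical and not taken from any particular library. Each bound member function gets its own static adapter, so the compiler generates the member call with the right calling convention, and the delegate only ever stores a plain non-member function pointer.

```cpp
#include <cassert>
#include <utility>

template <typename Signature>
class Delegate;

template <typename R, typename... Args>
class Delegate<R(Args...)> {
    // One adapter is instantiated per bound member function.
    template <typename C, R (C::*Method)(Args...)>
    static R adapter(void* object, Args... args) {
        return (static_cast<C*>(object)->*Method)(std::forward<Args>(args)...);
    }

    void* object_ = nullptr;
    R (*stub_)(void*, Args...) = nullptr;

public:
    template <typename C, R (C::*Method)(Args...)>
    static Delegate bind(C& object) {
        Delegate d;
        d.object_ = &object;
        d.stub_ = &adapter<C, Method>;
        return d;
    }

    R operator()(Args... args) const {
        return stub_(object_, std::forward<Args>(args)...);
    }
};

// Example types: a member function returning a structure by value.
struct Size { int width; int height; };

struct Shape {
    int factor;
    Size scaled(int n) { return Size{n * factor, 2 * n * factor}; }
};
```

A call site would look like `Delegate<Size(int)>::bind<Shape, &Shape::scaled>(shape)`. The cost is an extra call through the adapter unless the compiler manages to turn it into a tail call or inline the member function into it.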

How do they get away with it?

So how do delegate implementations that don’t account for these seemingly insurmountable issues work at all? Well, it turns out that the situations that trigger the issues don’t come up as frequently as you might expect. Even if you aren’t being careful to avoid the problematic code, a combination of several factors means you may never encounter the issues:

Virtual inheritance is used sparingly
Several of the issues only come up when virtual inheritance is involved. Virtual inheritance is one of the less-frequently used C++ features. It has a space penalty, it adds indirection to base class member accesses, and it complicates base class construction. It’s only used when it’s really necessary. You may never need to write a class with any virtual bases, and even if you do, you may not need to use it with delegates. If you don’t use classes with virtual bases, you won’t get a situation where you need to find the virtual table pointer and fetch a value from the virtual table in order to invoke a pointer to a member function.
Classes where the this pointer doesn’t point to the virtual table pointer are rare
To make this happen, you need specific conditions involving a class with multiple base classes where the first non-empty base class has no virtual member functions and no direct or indirect virtual base classes, but the class inherits a virtual table pointer from another base class. It’s very rare to write a class that meets the requirements and has at least one virtual base class by coincidence. For example in many real-world cases, classes inherit a virtual destructor or a virtual base class via their first base class. This means assuming the virtual table pointer can be found at the location the this pointer points to rarely causes issues in practice.
Casting pointers to member functions across virtual inheritance relationships is non-standard
The offset to the virtual base class obtained from the virtual table when invoking a pointer to a member function will only be non-zero if the pointer represents a pointer to a function member of a virtual base class that has been cast to a pointer to a member of a derived class. Since this is not permitted by the C++ standard, it will never happen in portable code. Conveniently, this extension to the standard cannot be supported with the Itanium C++ ABI, so any code that uses it will fail to compile in many configurations (e.g. MinGW GCC on Windows x86-64, pretty much any Linux configuration, or macOS). The extension is unlikely to be useful in conjunction with delegates because the instance can be cast to a reference of the virtual base class when setting the delegate rather than casting the member function pointer. This means a situation where the offset to the virtual base class must be obtained from the virtual table when invoking a pointer to a member function is impossible in most portable code, and highly unlikely in code only built with the MSVC C++ ABI, especially considering the sparing use of virtual inheritance.
Functions return scalars more frequently than structures
The majority of functions return void or some kind of scalar value. Functions returning references (especially const references) are quite common, too. The different calling convention for member functions doesn’t affect functions that return void, scalars or references. Simply not using delegates with member functions that return structures or unions can be used as a workaround to avoid having to deal with the different calling convention used for member functions. You can also use a trait to prevent a delegate from being instantiated for pointers to member functions returning structure and union types.

These factors combine to allow code to work most of the time when various implementation difficulties are ignored.
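
The trait mentioned above for rejecting member functions that return structures or unions could be sketched like this. The trait name and example types are hypothetical, and for brevity it only covers plain and const-qualified member functions, not volatile, ref-qualified or noexcept variants.

```cpp
#include <cassert>
#include <type_traits>

// Primary template: anything that is not a recognised member function pointer.
template <typename T>
struct returns_class_or_union : std::false_type {};

template <typename C, typename R, typename... Args>
struct returns_class_or_union<R (C::*)(Args...)>
    : std::integral_constant<bool,
          std::is_class<R>::value || std::is_union<R>::value> {};

template <typename C, typename R, typename... Args>
struct returns_class_or_union<R (C::*)(Args...) const>
    : std::integral_constant<bool,
          std::is_class<R>::value || std::is_union<R>::value> {};

// Example types, declarations only.
struct Vec { float x; float y; };

struct Body {
    Vec position();
    const Vec& position_ref() const;
    int mass() const;
    void reset();
};
```

A delegate's bind function could then contain a static_assert on the negated trait, so binding something like Body::position fails at compile time while the functions returning void, a scalar or a reference are accepted.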

Updates

  • Updated on 6 June 2023 to correctly cover the case where a class declares at least one virtual member function while deriving from a single base class with no member functions (thanks to ykiko for pointing out this error).
Lifting a jinx?
https://rants.vastheman.com/2019/08/13/jinx/
Tue, 13 Aug 2019 10:13:17 +0000

The past decade has seen a substantial increase in rail freight in Australia. Capital investment like the Southern Sydney Freight Line and conversion of the Victorian North East line to 1435 mm standard gauge is paying off. Allied Pinnacle has a siding just south of Kensington station, and Southern Shorthaul Railroad (SSR) currently has the contract to deliver their wheat. SSR uses a pair of S class “bulldog nose” locomotives from the 1950s coupled back-to-back to operate this service. Right now they’re using S302 and S317, but they were using S312 earlier in the year.

S302 is named after the pioneer Edward Henty. Here it is at Kensington, in orange and grey livery:

S302 at Kensington

S317 was the last S class locomotive to be delivered, and was named after Sir John Monash. S317 has been involved in two major accidents: it was rebuilt after a head-on collision with X33 in 1967, and collided with the rear of the Spirit of Progress at Barnawartha in 1982, killing the crew. Here it is at Kensington:

S317 at Kensington

Wait! What’s that below the cab window? It doesn’t have the old Sir John Monash nameplate any more. It seems to be named Jenny Molloy now! Who’s Jenny Molloy? Whoever she is, she definitely isn’t as well-known as Sir John Monash. She does have the same initials, though. Here’s a view of the nose:

S317 at Kensington

Interesting paintwork inside the horns, too. If anyone knows more about the renaming of S317 or who Jenny Molloy is, I’d love to hear about it. Did the old name have too many bad connotations? Was the new name intentionally selected to have the same initials? Does the new name lift a jinx?

The Q Factor
https://rants.vastheman.com/2018/03/20/qfactor/
Tue, 20 Mar 2018 11:27:27 +0000

QSound logo screen

If you spent much time in video game arcades in the ’90s, you’re sure to have seen the QSound logo proudly displayed during the attract loop of games running on Capcom’s CPS-2 and ZN-1/ZN-2 hardware platforms, and heard the distinctive jingle. But what exactly is QSound? What does it actually do? Capcom was definitely keen to promote its inclusion, but did it really give an edge in any area besides marketing? Was it worth the licensing fees they undoubtedly paid, and the precious attract loop time they spent announcing it?

As implemented in Capcom’s arcade systems, QSound technology is provided by a digital signal processor (DSP) running an internal program that implements a sixteen-channel sample player/mixer. It produces 16-bit stereo output at just over 24 kHz. It supports 16-bit samples, but Capcom only ever used it with 8-bit sample ROMs. In addition to playing and mixing the channels, it applies “spatialisation” effects, intended to give the impression of a more expansive virtual sound stage. I know what you’re thinking: everything I’ve said so far sounds like marketing material, but what do the effects actually do, and does it actually work?

The most prominent effect you’ll hear is the way the QSound DSP handles stereo panning. Conventionally, panning just reduces the volume on one channel or the other. If you pan a sound hard left, it’s sent to the left speaker at full volume and silenced on the right speaker. As you pan towards the centre, the volume on the right speaker increases until you reach the centre position where the volume is equal on left and right speakers. Panning past the centre to the right reduces the volume on the left speaker.
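
Conventional panning as described here can be sketched in a few lines. The function and type names are made up, and the linear ramp is an assumption for illustration, since real mixers often use other pan laws.

```cpp
#include <algorithm>
#include <cassert>

struct StereoGain {
    double left;
    double right;
};

// pan: 0.0 = hard left, 0.5 = centre, 1.0 = hard right.
// Each channel stays at full volume on its own side of centre and
// ramps linearly down to silence as the sound is panned away from it.
inline StereoGain conventional_pan(double pan) {
    pan = std::min(1.0, std::max(0.0, pan));
    StereoGain gain;
    gain.left = std::min(1.0, 2.0 * (1.0 - pan));
    gain.right = std::min(1.0, 2.0 * pan);
    return gain;
}
```

For example, conventional_pan(0.25) gives full volume on the left and half volume on the right.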

Rather than simply adjusting the level, the QSound DSP produces two components for each stereo output: one sent directly to the speaker, and the other passed through a digital filter. Panning controls the volume ratio of the component sent directly to the speaker and the filtered component. For the left output, when a sound is panned hard left, the sound is sent directly to the speaker only. As you pan towards the centre, the amount sent through the filter increases until you reach the centre, where the filtered component is about 2 dB lower than the direct component. Panning further to the right continues to increase the volume of the filtered component, and reduces the volume of the component sent directly to the speaker. When panned hard right, it’s sent through the filter only. For what happens with the right stereo output, just reverse the directions. The pan tables are illustrated in the graph below.

QSound pan
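
Structurally, each stereo output is a weighted sum of the dry signal and a filtered copy. The sketch below shows that structure only: the function names are hypothetical, the filter is assumed to be a simple FIR filter for illustration, and the gains are parameters supplied by the caller rather than the real pan table values. The db_to_ratio helper converts a level difference like the 2 dB mentioned above into a linear ratio (about 0.79).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Convert a level in decibels to a linear amplitude ratio.
inline double db_to_ratio(double db) {
    return std::pow(10.0, db / 20.0);
}

// Plain FIR filter: out[n] = sum over k of taps[k] * in[n - k].
inline std::vector<double> fir(const std::vector<double>& in,
                               const std::vector<double>& taps) {
    std::vector<double> out(in.size(), 0.0);
    for (std::size_t n = 0; n < in.size(); ++n) {
        for (std::size_t k = 0; k < taps.size() && k <= n; ++k) {
            out[n] += taps[k] * in[n - k];
        }
    }
    return out;
}

// One stereo output: direct component plus filtered component.
inline std::vector<double> mix_output(const std::vector<double>& in,
                                      const std::vector<double>& taps,
                                      double direct_gain,
                                      double filtered_gain) {
    const std::vector<double> filtered = fir(in, taps);
    std::vector<double> out(in.size(), 0.0);
    for (std::size_t n = 0; n < in.size(); ++n) {
        out[n] = direct_gain * in[n] + filtered_gain * filtered[n];
    }
    return out;
}
```

Panning would then amount to looking up a direct_gain and filtered_gain pair for each output from the pan position.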

That’s all well and good, we know that panning controls the ratio of filtered to unfiltered output rather than just adjusting volume, but what do the filters themselves do? Let’s start by looking at the filter used for the left stereo output:

QSound right-to-left

It has a fairly flat passband up to about 1 kHz, falls to about -5 dB at 2 kHz, has a small peak at 3 kHz, and falls off rapidly from 5 kHz to the stop band at 6 kHz. Can you guess what this is for yet? This filter is supposed to simulate the sound your left ear hears from a sound source to your right. Low frequencies tend to diffract around obstacles, so they don’t have much directionality, midrange frequencies pass through your head fairly effectively, and high frequencies are basically blocked by your head. This is designed to give a more realistic impression of the position of a source on the virtual sound stage than simple panning.

The filter for the right stereo output, simulating what your right ear hears from a sound source on your left, is similar but not quite the same (perhaps the asymmetry is supposed to increase the realism):

QSound left-to-right

This is not the only effect applied by the QSound DSP; it’s just the most prominently audible one. It leaves the question of whether it actually works as intended. In my opinion, it works reasonably well if used effectively, provided the speaker setup has reasonable stereo separation, and the listener is in the sweet spot – easily achievable with headphones or a home stereo, but in a noisy arcade with poor cabinet acoustics, not so much. Maybe it would work OK with something like a Sega Blast City cabinet, specifically designed for stereo output. Some of the other QSound effects are less dependent on stereo separation, so they work better in typical arcades.

TPG FTTB Settings
https://rants.vastheman.com/2018/01/24/settings/
Wed, 24 Jan 2018 06:52:16 +0000

In case anyone else wants to configure third-party equipment for a TPG fibre-to-the-building service, here are the details. Below the fold are screenshots of the settings entered in the web-based configuration UI of an AVM FRITZ!Box. Note that the SIP password is not the same as your account password, and you’ll need to obtain this somehow. TPG doesn’t make this easy, but it is possible.

Internet connection

Modulation: VDSL2 17a (ITU G.993.2)
VLAN: 2
VPI: 1
VCI: 32
Encapsulation: PPPoE
Authentication: PAP
Username: your TPG username optionally followed by “@tpg.com.au”
Password: doesn’t matter – it isn’t actually verified (you can use your account password)

Phone service connection

Connection type: PVC
VLAN: 6
802.1q PCP tag (PBit or 802.1p traffic class): 5 (VO, voice with < 10 ms latency/jitter)
VPI: 1
VCI: 32
Encapsulation: routed bridge encapsulation
IPv4 configuration: DHCP
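
For anyone configuring this on equipment that wants the raw tag value, the VLAN and PCP settings end up in the 16-bit Tag Control Information field of the 802.1Q header: three bits of PCP, one drop eligible indicator bit, and a 12-bit VLAN ID. The helper names in this sketch are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Pack an 802.1Q Tag Control Information value: PCP (3 bits), DEI (1 bit), VID (12 bits).
constexpr std::uint16_t make_tci(unsigned pcp, bool dei, unsigned vlan_id) {
    return static_cast<std::uint16_t>(((pcp & 0x7u) << 13) |
                                      ((dei ? 1u : 0u) << 12) |
                                      (vlan_id & 0xFFFu));
}

// Extract the priority code point from a TCI value.
constexpr unsigned pcp_of(std::uint16_t tci) {
    return (tci >> 13) & 0x7u;
}

// Extract the VLAN ID from a TCI value.
constexpr unsigned vlan_of(std::uint16_t tci) {
    return tci & 0xFFFu;
}
```

With the phone service settings above (PCP 5, VLAN 6, DEI clear), make_tci(5, false, 6) evaluates to 0xA006.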

SIP connection settings

Registrar server: uni-v1.tpg.com.au
Proxy server: uni-v1.tpg.com.au
STUN server: none (disabled)
Connection type: SIP trunk
Telephone number: your telephone number including area code (ten digits)
Username: your telephone number including area code (ten digits)
Password: your SIP password (16 characters including uppercase and lowercase letters and digits)
Voice codecs: G.711 and G.729

Internet account settings for AVM FRITZ!Box

DSL settings

Telephone line settings for AVM FRITZ!Box

Line settings

Telephone number settings for AVM FRITZ!Box

SIP settings

TPG: Just Don’t
https://rants.vastheman.com/2018/01/20/tpg/
Fri, 19 Jan 2018 15:38:21 +0000

Due to persistent issues with line quality, I switched an Internet connection from Internode ADSL2+ to TPG fibre to the building (FTTB). Although the connection quality is better, just about everything else about TPG is worse. I strongly recommend avoiding TPG. Problems include:

  • Error-prone signup process
  • Supplied modem/router is heavily compromised
  • Phone service is tied to compromised modem/router
  • No IPv6 support
  • Support staff very inconsistent
  • Good support staff hobbled by policy

My Internode connection had become very slow and unstable in hot, dry weather. Strangely it was fine in the rain, and even during flooding. It almost seemed like something needed to be damp to maintain an electrical connection. There’s no way to actually get these kinds of issues resolved, as the ISP and last mile provider will blame each other and the in-building wiring, and charge extortionate rates for technicians to be called out without actually solving the issues. The only other option I have for last mile is TPG. I’d been switching to Telstra LTE on bad days, and to be fair it’s actually not too bad at the moment. It seems to be pretty fast and stable, but I imagine that will get worse as more people start to use the network. But using LTE comes with a number of limitations, and it’s supposed to be my backup, not my day-to-day Internet connection.

Sadly, it seems that Internode may be going downhill since being acquired by TPG. After iiNet acquired Internode, they were allowed to operate independently for the most part. The call centre in Adelaide remained open, Internode continued to offer the same kinds of perks as before, including Usenet servers, Steam content mirrors, native IPv6 connectivity, and more. However, TPG has consolidated iiNet and Internode support and seems to be phasing out Internode perks. They’ve even started selling TPG nbn™ HFC (DOCSIS cable) under the Internode brand name, providing the same IPv4-only connection and obfuscated SIP phone service.

With the consolidation in the Australian ISP sector, there’s a big reduction in competition. TPG has merged with or acquired Soul, AAPT, Chariot, iiNet, Internode, TransACT, Westnet, PIPE, and more. There doesn’t seem to be a good alternative at the moment. There may be an opportunity for an upstart ISP that understands what made “premium” ISPs like Internode successful in the first place.

Sign-up process

I initially tried signing up for the service through the web site, converting an existing dial-up account I’ve had for over a decade. At the end of the process, it gave me a red error message telling me there was a problem and to call customer service. Despite this, it still charged me the setup fee, and not the correct setup fee for the options I’d chosen. Also, there’s no option to choose the delivery address for the supplied modem/router through the web interface: it will always be sent to the billing address, not the service address. This means you need to get it from the billing address to the service address if they aren’t the same.

It took multiple calls to customer service over several weeks to get the incorrect setup fee refunded and get back to where I started again. The telephone support staff seem to vary substantially. Many of them don’t seem to be interested in actually getting issues resolved, and just want to read from a script. I also had support staff promise to call back, and then never do so.

After this, I tried my luck signing up over the phone. The saleswoman insisted that I needed to create a new account, and couldn’t convert my existing dial-up account over. She assured me that my existing TPG e-mail address could be transferred to the new account without any period where mail would be lost. It’s possible to specify a delivery address for the modem/router when signing up over the phone. However, after completing the sign-up process, I was transferred to support who informed me that there was no need to sign up for a new account at all, and it seems to be impossible to transfer the existing e-mail account to the new account without a multi-day period where e-mails will be lost. The call was recorded, so it’s on record that the saleswoman promised me something that they can’t deliver. This issue still hasn’t been resolved.

Supplied modem/router

TPG supplied a Huawei HG659 modem/router. This device is rather lacking in functionality. It lacks DECT base station functionality, it can’t function as a SIP gateway for multiple IP phones, it doesn’t support incoming VPN connections, and numerous other useful features are absent. On top of this, TPG supplies the device with crippled firmware. The predefined “admin” user account is limited to changing basic settings, and it’s not possible to create an account with full access. It’s possible to access some hidden settings (including authentication, encapsulation and VLAN settings) with a JavaScript debugger attack, but trying to access other settings this way drops you back to the login page. It’s completely impossible to access bandwidth settings and telephony settings, or to back up/restore settings.

The modem/router is pre-configured and has TR-069 permanently enabled on VLAN 6. This allows TPG to push configuration or firmware updates to the device at any time. This is a huge problem for stability and security. There’s no way to control if/when updates may be pushed, allowing your connections to be interrupted at any time. A poorly considered or malicious update could cause denial of service, DNS hijacking, communication interception, or a host of other issues. Flaws in TR-069 are actively exploited by the Mirai botnet as well as other malware.

TPG’s justification for this is that it makes it easy for TPG to fix configuration problems, and they make vague claims about doing it for “security” reasons. It’s true that it makes support simpler if the ISP can push out default configuration. However it comes with a massive security risk. They should acknowledge the security risks involved, and give the customer the ability to choose between convenience and security. The real motivation seems to be an effort to hide the SIP settings to prevent customers from using other SIP clients or IP phones. I really don’t understand TPG’s obsession with preventing the customer from using a SIP client of their choice.

It’s possible to put the modem/router into firmware recovery mode by holding the reset button (with a straightened paperclip) for twenty seconds, and then to load a different firmware image. However, Huawei doesn’t seem to distribute a standard firmware image, so you’d need to use a firmware image from another ISP, with its own customisations and potential security issues. If you don’t enable TR-069 after loading a different firmware image, you won’t be able to obtain the SIP settings, so the phone service still won’t be usable. However, if you do enable TR-069, TPG will just push out their firmware image along with the configuration, and you’ll be back where you started.

In summary, it’s impossible to get the modem/router into a clean state where you can fully control it and still use TPG’s phone service. The modem/router supplied by TPG must be treated as a hostile device on your network. As the customer, you can’t prevent malicious configuration or firmware updates being applied, and you can’t verify or change the device’s configuration.

Phone service inflexibility

TPG’s SIP phone service for FTTB customers is limited and inflexible. Unlike other SIP phone services, it’s only accessible from TPG’s network. The server uses the DNS name uni-v1.tpg.com.au which resolves to three private IPv4 addresses – 172.26.0.17, 172.26.0.1, and 172.26.0.65 – accessible via VLAN 6. TPG requires use of the narrowband 8 kbps G.729 voice codec, which provides poor call quality. TPG also actively works to prevent customers from using their own IP phones.

TPG refuses to supply customers with SIP connection details, only pushing them out via TR-069. The SIP username and password are different from the username and password used to access e-mail and other TPG services. It seems somewhat strange and pointless to require authentication at all, since the SIP server is only accessible on a TPG connection via a specific VLAN. It would be trivial to identify the customer by the origin of the connection. It seems to be nothing more than a way to force the customer to use the compromised modem/router supplied by TPG. (TPG actually does provide SIP settings for some services on this page. The aphone1 to aphone6 servers resolve to public IP addresses, but they are only accessible from TPG connections. However, there’s nothing to indicate which customers can use these settings – it’s definitely not applicable to FTTB services.)

It was previously possible to use a JavaScript debugger attack on the supplied Huawei modem/router to back up settings, and extract the SIP settings, including the password, from the resulting file. However, new firmware made that impossible. It would be possible to buy a VDSL DSLAM, emulate the SIP server, and steal the credentials that way, but this is prohibitively expensive. It may be possible to connect to VLAN 6 with a different modem/router, use software to emulate the TR-069 client, and obtain the VoIP settings that way. It may also be possible to open the supplied modem/router, solder in a serial or JTAG header, and dump the Flash filesystem. Desoldering the Flash chips and dumping the data directly is another option. All of these options are a lot of work just to be able to use a service that you pay for, without having to allow a compromised device on your network.

There’s no way to unbundle the phone service from the Internet service. So if you decide that the risk of using a compromised modem/router is too high and the workarounds are too impractical, you’re still forced to pay for a phone service you can’t use.

All this effort to prevent customers from using SIP clients other than the supplied modem/router seems rather strange. There doesn’t seem to be a technical reason for it, as the service seems to use standard protocols, and customers who’ve managed to extract the details from their modem/router haven’t had issues using other SIP clients. In the absence of any plausible explanation, it almost seems like TPG wants to have devices they control on customers’ networks for some malicious purpose.

The decision to require G.729 seems odd as well. With ever-increasing line speeds, a 32 kbps codec like G.726 shouldn’t be a problem. In particular, G.726 would allow lossless forwarding to cordless DECT handsets. Only allowing access from TPG’s network is also artificially limiting. Packets are cheap to forward – there’s no real reason not to allow access from other networks. It can still be limited to one or two concurrent calls and/or concurrent registrations. Call quality will suffer if there’s unpredictable latency or packet loss in the path, but the customer can deal with that.

NodePhone SIP service, ironically owned by TPG, can be used from anywhere on the Internet. I’ve successfully used it from as far away as Hong Kong and Shanghai with good results. Right now I’m using a NodePhone service over my TPG FTTB connection as it’s a better option than using a compromised modem/router.

Lack of password verification

TPG requires your VDSL modem to be configured to use PAP authentication. However, the password is not verified. They assume that by being physically patched to the DSLAM port, you are authorised to use the service. This isn’t a safe assumption. In most apartment buildings, tradesmen and/or residents can easily access the main distribution frame (MDF) and change the patches. For services with the DSLAM located in a roadside cabinet or telephone exchange, there are further points along the path where a technician could unintentionally or maliciously patch the DSLAM port assigned to you to another line.

This appears to be to make support simpler. If the password is not verified, a dummy password can be used in settings pre-configured or pushed out to the customer’s modem/router via TR-069, and support staff can walk a customer through the process of setting up a modem/router without either of them having to know the password. However, it’s another security hole, and given the metadata retention laws and the aggressive behaviour of copyright holders, it’s unwise to make it in any way simpler for someone to impersonate the customer.

Lax e-mail security

TPG’s mail servers support explicit and opportunistic SSL/TLS encryption. However, as of the time of writing, TPG’s relevant support pages don’t make any mention of enabling encryption, and the step-by-step guides for Apple Mail and Android phones show settings that will result in usernames, passwords, and mail contents being transmitted in plain text.

This shows blatant disregard for customers’ security. A customer following TPG’s instructions for configuring Apple Mail or an Android phone will expose their account name and password to anyone with the ability to sniff packets between them and TPG’s mail servers. On a public WiFi network, this includes anyone in the vicinity who can use packet capture software.

No IPv6 support

TPG does not officially support IPv6 and has no timeline for IPv6 rollout. There are rumours that they’re testing IPv6 with selected customers, but there’s no way to voluntarily join the test group. IPv6 is not a new technology. RFC 2460 was published in late 1998, almost twenty years ago. Microsoft began requiring applications to work in a pure IPv6 environment (no IPv4) for logo certification beginning with Windows Vista in 2006, over ten years ago. All major operating systems and most network equipment provide IPv6 support.

TPG is really behind here. Internode (now owned by TPG) has provided dual stack IPv4/IPv6 since 2008 (ten years ago), assigning a static /56 subnet and a dynamic /64 subnet to each connection. Even Telstra, not known for being on the cutting edge, has rolled out IPv6 for NBN and ADSL customers. With iiNet, you at least have the option of using a 6rd service to provide IPv6 connectivity, although it suffers from some limitations compared to a true dual stack deployment.

Phone support

The quality of service provided by the phone support staff varies enormously. You often need to work your way through multiple people before you reach someone who seems to actually care or be interested in helping. Even then, the staff are hobbled by processes and policies, and may not be able to really do much. I’ve experienced this multiple times with the support and engineering teams. One time, the guy said something to the effect of, “Well, I understand what you’re saying, but I don’t set the policy. The call’s recorded, I’ll mark this as a complaint, hopefully someone in Sydney will actually hear it.”

There are definitely some people at TPG who seem to want to do the right thing by the customers. Ace and Joy from support, in the unlikely event that you’re reading this, I’d like you to know I think you’re great. You’ve both got back to me when you said you would, tried to understand the issues I raised, and tried to get things resolved as well as you can. It’s not your fault TPG’s policies are hostile to the customer, or some of the other people on the support team don’t seem to care.

Closing thoughts

I’ve had TPG Internet accounts for over twenty years now. Back in the dial-up days, TPG was the ISP to beat. They provided national service at competitive rates, and it just worked with no fuss. Now everything’s a nightmare. It seems TPG wants to sell to people who just use their Internet connection for Facebook and YouTube. There’s definitely a market for that, but the trouble is they’ve absorbed the ISPs who catered for people who wanted a little more, and soon there may not be any other options left. It’s sad to see the Australian ISP landscape go this way.

Going Old-School
https://rants.vastheman.com/2017/07/08/oldschool/
Fri, 07 Jul 2017 18:27:31 +0000

For lulz, I decided to rewrite MAME’s Intel 4004 CPU core and add support for most 4040 features. The new CPU core operates at the bus cycle level, and exposes most useful signals. It also uses lots of address spaces for all the different kinds of memory and I/O it supports (thanks OG). Some CPU core bugs were fixed along the way – notably intra-page jumps on page boundaries were broken.

One nice benefit we get from this is being able to hook up the I/O for the Bally/Nutting solid-state Flicker pinball prototype (supposedly the first microprocessor-controlled pinball machine) the way the hardware actually worked. I also hooked up the playfield lamp matrix as outputs and the operator adjustments as machine configuration while I was at it. We need a proper thermal model of a lamp with PWM dimming support before that part can be declared perfect. (It previously used a hack, pulling the low bits of RC out of the CPU using the state interface. This worked due to a quirk of the game program, and there was no way to implement it properly without the 4004 CM-RAM signals being exposed.)

Possibly more interestingly, we can now emulate the Intel INTELLEC® 4/MOD 40 development system. There seem to be very few surviving examples of this system, but fortunately we’ve got monitor PROM dumps, and there’s some information floating around the web. It has interesting debugging features on the front panel. There’s a scanned manual, but the schematics are very poor quality. However, with some idea of how it works, it’s possible to work out what all the chips are supposed to be. That’s the fun part. Turning it into MAME code isn’t as much fun, but it’s doable.

The front panel looks like this:

That requires clickable artwork for the switches and outputs for the LEDs to get a usable experience (writing the MAME XML layout really isn’t fun). There’s a simple monitor in PROM, designed to be used with an ASCII teleprinter with paper tape reader and punch (e.g. a Teletype Model 33 ASR). MAME’s RS-232 video terminal will have to do as a substitute for that.

If you get bleeding edge MAME (i.e. built from source no more than a day or so old), you can try it out in emulation. Did you ever wonder how developers may have debugged 4004 code in the mid to late ’70s? Well even if you didn’t, now you can find out.

The front panel looks like this in emulation (without the explanatory labels), and by default MAME shows the video terminal below this:

All those LEDs are functional, and all those switches are clickable and visually reflect their current state.

So how would you actually use it in practice? That’s where this brief instruction manual on the MAMEdev wiki comes in. It’s complete with examples of how some of the monitor commands and front panel debugging features can be used. It’s marked NOT_WORKING in MAME for now because you need to manually set up the terminal the first time you use it, and I haven’t finished implementing the universal slots and storage cards. But you can do all the things described on that page.

Does anyone care? Probably not. Will anyone actually even try this system out in MAME? Probably not (apart from Robbbbbbbert). But this is another example of something that you can only do in MAME, and how completely unrelated systems can both benefit from emulating the same chip properly. It also gets rid of one previously unavoidable hack, and gets us one step closer to feature parity with EmuAllSystems.

Attacking the Weak
https://rants.vastheman.com/2017/02/13/attacking/
Sun, 12 Feb 2017 23:37:19 +0000

ShouTime dumped the incredibly rare game Omega (Nihon System). It’s a ball-and-paddle game running on similar hardware to Sega’s Gigas. These games use an NEC MC-8123 CPU module containing a Z80 core, decryption circuitry, and an 8 KiB encryption key in battery-backed RAM. When fetching a byte from ROM or RAM, the CPU chooses a byte from the encryption key based on twelve of the address bits and whether it’s an M1 (opcode fetch) cycle or not. This byte from the encryption key controls what permutation (if any) is applied to the byte the CPU fetches. This encryption scheme could have been brutal, requiring extensive analysis of a working CPU module to crack, if it weren’t for a fatal flaw: Sega used a simple linear congruential generator algorithm to create 8 KiB keys from 24-bit seeds. That means there are fewer than seventeen million encryption keys to test. Seventeen million might sound like a lot, but it’s far less than the total possible number of keys, and definitely small enough to apply a known plaintext attack in a reasonable amount of time.

So how do we go about attacking it? First we have to make an assumption about what the game program is going to be doing. Given that the hardware looks pretty similar to Gigas and Free Kick, I guessed that one of the first things the program would do is write a zero somewhere to disable the non-maskable interrupt generator and disable maskable interrupts. So I wrote a program to find candidate seeds (no, I won’t show you the source code for this program – it’s embarrassingly ugly and hacky, not something I could ever be proud of):

  • Start with first possible 24-bit seed value
  • Generate 8 KiB key using algorithm known to be used by Sega
  • Decrypt first few bytes of program ROM using this key
  • If it looks like Z80 code to store zero somewhere and disable interrupts, log the seed
  • Repeat for next possible seed value until we run out of values to try
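
The loop above can be sketched roughly like this. This is not the original program: the generator constants, the XOR used in place of the real decryption, and the exact byte patterns tested are all made-up stand-ins (the post gives neither Sega’s real generator nor the MC-8123 permutation tables), and every name here is hypothetical. Only the overall shape of the search follows the steps listed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Everything marked "toy" is a placeholder: the post does not give Sega's
// real generator constants or the MC-8123 permutation tables.
constexpr std::uint32_t SEED_MASK = 0x00ffffff; // seeds are 24 bits wide

// Toy linear congruential generator producing `length` key bytes from a seed.
std::vector<std::uint8_t> toy_generate_key(std::uint32_t seed, std::size_t length)
{
    std::vector<std::uint8_t> key(length);
    std::uint32_t state = seed & SEED_MASK;
    for (std::uint8_t &byte : key)
    {
        state = (state * 0x0343fdU + 0x269ec3U) & SEED_MASK; // made-up constants
        byte = static_cast<std::uint8_t>(state >> 16);
    }
    return key;
}

// Toy decryption: XOR stands in for the key-selected permutation.
std::uint8_t toy_decrypt(std::uint8_t encrypted, std::uint8_t key_byte)
{
    return static_cast<std::uint8_t>(encrypted ^ key_byte);
}

// One guess at "stores zero somewhere and disables interrupts":
// LD A,0 / LD (nn),A / DI   or   XOR A / LD (nn),A / DI
bool looks_like_startup(std::vector<std::uint8_t> const &code)
{
    if (code.size() >= 6 && code[0] == 0x3e && code[1] == 0x00 && code[2] == 0x32 && code[5] == 0xf3)
        return true;
    return code.size() >= 5 && code[0] == 0xaf && code[1] == 0x32 && code[4] == 0xf3;
}

// Try every seed in [first, last] and collect the ones that give plausible code.
std::vector<std::uint32_t> find_candidate_seeds(
        std::vector<std::uint8_t> const &rom_start,
        std::uint32_t first = 0,
        std::uint32_t last = SEED_MASK)
{
    assert(first <= last && last <= SEED_MASK);
    std::vector<std::uint32_t> candidates;
    std::vector<std::uint8_t> decrypted(rom_start.size());
    for (std::uint32_t seed = first; seed <= last; ++seed)
    {
        std::vector<std::uint8_t> const key = toy_generate_key(seed, rom_start.size());
        for (std::size_t i = 0; i < rom_start.size(); ++i)
            decrypted[i] = toy_decrypt(rom_start[i], key[i]);
        if (looks_like_startup(decrypted))
            candidates.push_back(seed);
    }
    return candidates;
}
```

Calling find_candidate_seeds with the first few bytes of the program ROM and the default range walks all 2^24 seeds. A real version would generate the full 8 KiB key and pick key bytes by address and M1 state as described earlier, rather than using one key byte per ROM byte.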

This ran in just a few minutes on an i7 notebook, and narrowed down the millions of possible seed values to just five candidates: 36DF3D, 6F45E0, 7909D0, 861226, and BE78C9 (in hexadecimal notation). Now I could have tried these in order, but it looked like Sega had made another misstep: besides using a predictable algorithm to generate the key, they also used a predictable seed value to feed this algorithm. The candidate seed value 861226 looks like a date in year-month-day format. It turns out this seed generates the correct key to decrypt the game program, so I guess we know what someone at Sega was doing the day after Christmas in 1986.

Brian Troha hooked up the peripheral emulation, and the game will be playable in MAME 0.183 (due for release on 22 February). Colours aren’t quite right as we don’t have dumps of the palette PROMs yet, but we expect to resolve this in a future release. Thanks to ShouTime and everyone else involved in preserving this very rare piece of arcade history.

My PAL with the LASERs
https://rants.vastheman.com/2015/12/15/lasers/
Tue, 15 Dec 2015 12:54:05 +0000

Back in the distant past, MAME started cataloguing programmable logic devices (PLDs) in addition to ROMs. This was met with considerable hostility from certain segments of the community, as it was seen as forcing them to obtain files they saw as unnecessary for emulation in order to run their precious games. However PLDs are programmable devices, and it’s important to preserve them. So far though, the PLD dumps have mainly been used by PCB owners looking to repair their games. They haven’t been used by MAME for emulation. However, PLDs are key to the operation of many arcade games, performing functions like address decoding and video mixing.

One such arcade board is Zaccaria’s Laser Battle, also released under license by Midway as Lazarian. This board uses complex video circuitry that was poorly understood. It includes:

  • TTL logic for generating two symmetrical area effects, or one area effect and one point effect
  • TTL logic for generating an 8-colour background tilemap
  • Three Signetics S2636 Programmable Video Interfaces (PVIs), drawing four 4×4 monochrome sprites each
  • TTL logic for generating a single 32×32 4-colour sprite
  • A Signetics 82S101 16×48×8 Programmable Logic Array (PLA) for mixing the video layers and mapping colours

On top of this, they decided it was a good idea to use some clever logic to divide the master clock by four when feeding the Signetics S2621 Universal Sync Generator (USG) that generates video timings, but to divide it by three to generate the pixel clock feeding the rest of the video hardware. This lets them get one third more horizontal resolution than the Signetics chips are designed to work with. They need additional logic to line up the pixel clock with the end of horizontal blanking, because the number of pixels in a line isn’t divisible by three, and some more logic for delaying the start of the visible portion of each frame and cutting it off early because they wanted less vertical resolution than the Signetics chips are designed for. It uses a GRBGRBGR colour scheme where the most significant bits are in the middle of the byte and the missing least significant blue bit is effectively always set, so it can’t produce black, only a dark blue. Was this design effort worth it? I guess they must’ve made some money off the Midway license at least.
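
As a rough illustration of that colour format, here’s a sketch of how a colour byte might be turned into three-bit red, green and blue levels. The exact bit positions are an assumption pieced together from the descriptions in this post (MSBs in bits 5 to 3, middle bits in bits 2 to 0, LSBs in bits 7 and 6, blue LSB always set), and Rgb3 and decode_colour are hypothetical names, not MAME code.

```cpp
#include <cstdint>

// Three-bit intensity per channel (0 to 7).
struct Rgb3
{
    unsigned red, green, blue;
};

// Assumed layout, bit 7 down to bit 0: G R B G R B G R
//   bits 5-3: most significant blue/green/red bits
//   bits 2-0: middle blue/green/red bits
//   bits 7-6: least significant green/red bits (blue has none, treated as 1)
constexpr Rgb3 decode_colour(std::uint8_t value)
{
    return Rgb3{
            (((value >> 3) & 1u) << 2) | (((value >> 0) & 1u) << 1) | ((value >> 6) & 1u),
            (((value >> 4) & 1u) << 2) | (((value >> 1) & 1u) << 1) | ((value >> 7) & 1u),
            (((value >> 5) & 1u) << 2) | (((value >> 2) & 1u) << 1) | 1u };
}
```

With this reading, a colour byte of zero decodes to red 0, green 0, blue 1, which matches the observation that the darkest available colour is a dark blue rather than black.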

Anyway, this game has never worked properly in MAME. It’s always been missing the area and point effects, the colours have always been completely wrong, and the mixing between layers hasn’t worked properly either. And that’s done inside the PLA. The PLA has 48 internal logic variables, each of which can be programmed to recognise an arbitrary combination of levels on the 16 input lines. Each of the internal variables can drive any combination of the eight output lines. The outputs can be configured to be inverting or non-inverting.

In theory this sounds like a job for a ROM, so why use a PLA instead? Well a ROM with 16 input bits and eight output bits would need 64kB of space. Such a ROM would likely have been prohibitively expensive when this game was produced. I mean, its program ROMs are only 2kB each, so there’s no way they’d be sourcing a ROM 32 times that size just for video mixing. The PLA maps the same number of inputs to the same number of outputs with just 1,928 bits of storage (241 bytes), less than an eighth the size of one of the program ROMs. It can’t produce absolutely any arbitrary input to output mapping, but it’s more than enough for this application. In fact, it turns out they didn’t even need to use all the space in the PLA.
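
The arithmetic behind those sizes can be checked with a few compile-time constants. The per-term layout used here (two bits per input plus one bit per output, with eight polarity bits at the end) is the one described under “Streams of bits” below; the constant names are just for illustration.

```cpp
#include <cstddef>

constexpr std::size_t INPUTS = 16;
constexpr std::size_t OUTPUTS = 8;
constexpr std::size_t TERMS = 48;

// A ROM needs one OUTPUTS-bit word for every possible input combination.
constexpr std::size_t ROM_BITS = (std::size_t(1) << INPUTS) * OUTPUTS;

// The PLA needs two bits per input and one bit per output for each term,
// plus one polarity bit per output.
constexpr std::size_t PLA_BITS = TERMS * (2 * INPUTS + OUTPUTS) + OUTPUTS;

static_assert(ROM_BITS / 8 == 64 * 1024, "a ROM would need 64kB");
static_assert(PLA_BITS == 1928, "the PLA needs 1,928 bits");
static_assert(PLA_BITS / 8 == 241, "which is 241 bytes");
```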

Read the rest of the post if you want to know more about the process of decoding the PLA bitstream and examining its contents.

The hookup

By examining the schematics, we can see that the PLA’s inputs are hooked up like this:

Bit Name Description
0 NAV0 Sprite bit 0
1 NAV1 Sprite bit 1
2 CLR0 Sprite colour bit 0
3 CLR1 Sprite colour bit 1
4 LUM Sprite luminance (brightness)
5 C1* Combined PVI colour bit 1 (red) gated with blanking (active low)
6 C2* Combined PVI colour bit 2 (green) gated with blanking (active low)
7 C3* Combined PVI colour bit 3 (blue) gated with blanking (active low)
8 BKR Background tilemap red
9 BKG Background tilemap green
10 BKB Background tilemap blue
11 SHELL Shell point
12 EFF1 Effect 1 area
13 EFF2 Effect 2 area
14 COLEFF0 Area effect colour bit 0
15 COLEFF1 Area effect colour bit 1

CLR0, CLR1, LUM, COLEFF0 and COLEFF1 are relatively static bits that the game program sets by writing to I/O ports. The rest of the bits are generated dynamically based on video register and RAM contents.

Streams of bits

The PLA bitstream consists of 48 sets of 40 bits, one for each of the internal logic variables, followed by a final eight bits specifying which of the outputs should be active low. Each internal variable has two bits controlling how it’s affected by each of the 16 inputs (32 bits total), followed by eight bits specifying which outputs it shouldn’t activate (32+8 makes for a total of 40). If a pair of input bits are both zero (00), it’s impossible for that logic variable to be activated. This is an easy way to spot unused variables, and it’s the default state in an unprogrammed chip. If only the first bit in a pair is set (10), the corresponding input line must be high in order for the variable to be activated. If only the second of the bits is set (01), the corresponding input line must be low in order for the variable to be activated. If both bits in a pair are set (11), the variable may be activated irrespective of the state of the corresponding input line (often called a “don’t care” condition).
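
Put another way, each term boils down to two 16-bit masks: which inputs are allowed to be high, and which are allowed to be low. A minimal sketch of that rule, using hypothetical names (this isn’t the code MAME uses):

```cpp
#include <cstdint>

// allow_high: bit i is set if the first bit of input i's pair is set
// allow_low:  bit i is set if the second bit of input i's pair is set
// The term is active only if every input's actual level is an allowed one.
constexpr bool term_active(std::uint16_t inputs, std::uint16_t allow_high, std::uint16_t allow_low)
{
    // High inputs that aren't allowed to be high, or low inputs that
    // aren't allowed to be low, block the term.
    return (((inputs & ~allow_high) | (~inputs & ~allow_low)) & 0xffff) == 0;
}
```

A 00 pair clears the bit in both masks, so neither level satisfies it and the term can never fire. A 11 pair sets the bit in both, so that input is ignored.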

Finally, MAME expects the bitstream in a file containing a 32-bit integer specifying the total number of bits, followed by the bits themselves, packed into bytes least-significant bit first. Armed with this knowledge, we can write some code that converts the bitstream to an intermediate representation and displays it as C code:

// Skip the 32-bit bit count at the start of the file
UINT8 const *bitstream = memregion("plds")->base() + 4;
UINT32 products[48]; // upper 16 bits: input may be high; lower 16 bits: input may be low
UINT8 sums[48];      // outputs driven by each term
for (unsigned term = 0; 48 > term; term++)
{
    // Four bytes hold the sixteen two-bit input fields for this term
    products[term] = 0;
    for (unsigned byte = 0; 4 > byte; byte++)
    {
        UINT8 bits = *bitstream++;
        for (unsigned bit = 0; 4 > bit; bit++, bits >>= 2)
        {
            products[term] >>= 1;
            if (bits & 0x01) products[term] |= 0x80000000;
            if (bits & 0x02) products[term] |= 0x00008000;
        }
    }
    // The fifth byte flags the outputs this term does not drive, so invert it
    sums[term] = ~*bitstream++;
    if (PLA_DEBUG)
    {
        UINT32 const sensitive = ((products[term] >> 16) ^ products[term]) & 0x0000ffff;
        UINT32 const required = ~products[term] & sensitive & 0x0000ffff;
        UINT32 const inactive = ~((products[term] >> 16) | products[term]) & 0x0000ffff;
        printf("if (!0x%04x && ((x & 0x%04x) == 0x%04x)) y |= %02x; /* %u */\n", inactive, sensitive, required, sums[term], term);
    }
}
// The final byte selects which outputs are inverted
UINT8 const mask = *bitstream;
if (PLA_DEBUG) printf("y ^= %02x;\n", mask);

Equations

When we feed this the PLA bitstream dumped from a Lazarian board, we get the following output (blank lines added for readability):

if (!0x0000 && ((x & 0x001f) == 0x0001)) y |= 01; /* 0 */
if (!0x0000 && ((x & 0x001f) == 0x0002)) y |= 03; /* 1 */
if (!0x0000 && ((x & 0x001f) == 0x0003)) y |= 04; /* 2 */
if (!0x0000 && ((x & 0x001f) == 0x0011)) y |= 08; /* 3 */
if (!0x0000 && ((x & 0x001f) == 0x0012)) y |= 18; /* 4 */
if (!0x0000 && ((x & 0x001f) == 0x0013)) y |= 20; /* 5 */
if (!0x0000 && ((x & 0x001f) == 0x0005)) y |= 07; /* 6 */
if (!0x0000 && ((x & 0x001f) == 0x0006)) y |= 03; /* 7 */
if (!0x0000 && ((x & 0x001f) == 0x0007)) y |= 04; /* 8 */
if (!0x0000 && ((x & 0x001f) == 0x0015)) y |= 38; /* 9 */
if (!0x0000 && ((x & 0x001f) == 0x0016)) y |= 18; /* 10 */
if (!0x0000 && ((x & 0x001f) == 0x0017)) y |= 20; /* 11 */
if (!0x0000 && ((x & 0x001f) == 0x0009)) y |= 05; /* 12 */
if (!0x0000 && ((x & 0x001f) == 0x000a)) y |= 04; /* 13 */
if (!0x0000 && ((x & 0x001f) == 0x000b)) y |= 06; /* 14 */
if (!0x0000 && ((x & 0x001f) == 0x0019)) y |= 28; /* 15 */
if (!0x0000 && ((x & 0x001f) == 0x001a)) y |= 20; /* 16 */
if (!0x0000 && ((x & 0x001f) == 0x001b)) y |= 30; /* 17 */
if (!0x0000 && ((x & 0x001f) == 0x000d)) y |= 07; /* 18 */
if (!0x0000 && ((x & 0x001f) == 0x000e)) y |= 04; /* 19 */
if (!0x0000 && ((x & 0x001f) == 0x000f)) y |= 04; /* 20 */
if (!0x0000 && ((x & 0x001f) == 0x001d)) y |= 38; /* 21 */
if (!0x0000 && ((x & 0x001f) == 0x001e)) y |= 20; /* 22 */
if (!0x0000 && ((x & 0x001f) == 0x001f)) y |= 20; /* 23 */

if (!0x0000 && ((x & 0x0023) == 0x0000)) y |= 01; /* 24 */
if (!0x0000 && ((x & 0x0043) == 0x0000)) y |= 02; /* 25 */
if (!0x0000 && ((x & 0x0083) == 0x0000)) y |= 04; /* 26 */
if (!0x0000 && ((x & 0x29e3) == 0x01e0)) y |= 01; /* 27 */
if (!0x0000 && ((x & 0x2ae3) == 0x02e0)) y |= 02; /* 28 */
if (!0x0000 && ((x & 0x2ce3) == 0x04e0)) y |= 04; /* 29 */

if (!0x0000 && ((x & 0x08e3) == 0x08e0)) y |= 38; /* 30 */
if (!0x0000 && ((x & 0xffe3) == 0x10e0)) y |= 84; /* 31 */
if (!0x0000 && ((x & 0xffe3) == 0x50e0)) y |= 84; /* 32 */

if (!0x0000 && ((x & 0x0000) == 0x0000)) y |= 00; /* 33 */

if (!0x0000 && ((x & 0xffe3) == 0xd0e0)) y |= 04; /* 34 */
if (!0x0000 && ((x & 0xe0e3) == 0x20e0)) y |= 80; /* 35 */
if (!0x0000 && ((x & 0xe0e3) == 0x60e0)) y |= 40; /* 36 */
if (!0x0000 && ((x & 0xe0e3) == 0xa0e0)) y |= c0; /* 37 */
if (!0x0000 && ((x & 0xe0e3) == 0xe0e0)) y |= c0; /* 38 */
if (!0x0000 && ((x & 0xffe3) == 0x90e0)) y |= 40; /* 39 */

if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 40 */
if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 41 */
if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 42 */
if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 43 */
if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 44 */
if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 45 */
if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 46 */
if (!0xffff && ((x & 0x0000) == 0x0000)) y |= ff; /* 47 */

y ^= 00;

Fortunately this PLA seems to have been programmed manually and not using an automatic logic minimiser. Can you spot the patterns yet? The last eight terms (40–47) have been left in their virgin, unprogrammed states and hence can’t affect the output. Term 33 has been programmed to have no effect on the output, so they’ve used 81.25% of the chip. They could’ve gotten clever and added more logic if they really wanted to, but I guess with everything else they did on the board they were out of tricks by this point. But anyway, on to the equations.

Sprite mapping

The first 24 terms (0–23) only depend on the sprite-related bits, so it’s a good bet they control sprite colours. Let’s make a table of the mappings these terms produce. For convenience we’ll treat NAV0/NAV1 and CLR0/CLR1 as two-bit numbers:

NAV 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
CLR 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3
LUM 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1
OUT 01 03 04 08 18 20 07 03 04 38 18 20 05 04 06 28 20 30 07 04 04 38 20 20

Now the pattern should be really clear. Sprite pixel value 0 produces no output (making it dark or transparent), while the other three values are each mapped to a colour depending on the value of CLR. Setting LUM shifts the colour left by 3 bits into the most significant bits of the colour output. This can be summarised in a table with NAV in rows and CLR in columns (each cell shows the output with LUM clear/LUM set):

NAV\CLR 0 1 2 3
0 00/00 00/00 00/00 00/00
1 01/08 07/38 05/28 07/38
2 03/18 03/18 04/20 04/20
3 04/20 04/20 06/30 04/20

Or we could even use human-readable colour names since we know the output format is GRBGRBGR (there’s a distinct lack of green in this palette):

NAV\CLR 0 1 2 3
0 dark dark dark dark
1 red white magenta white
2 yellow yellow blue blue
3 blue blue cyan blue

So they used half the PLA to handle sprite colour mapping.
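
As a sketch, the whole sprite mapping collapses to a 4×4 lookup plus a shift. The values are taken from the table above; the names SPRITE_BASE and sprite_colour are hypothetical, and this is only an illustration of the pattern, not how the emulation does it.

```cpp
#include <cstdint>

// Base colours indexed [NAV][CLR]; NAV 0 produces no output.
constexpr std::uint8_t SPRITE_BASE[4][4] = {
        { 0x00, 0x00, 0x00, 0x00 },
        { 0x01, 0x07, 0x05, 0x07 },
        { 0x03, 0x03, 0x04, 0x04 },
        { 0x04, 0x04, 0x06, 0x04 } };

// LUM moves the colour from bits 2-0 up to bits 5-3 (the MSBs).
constexpr std::uint8_t sprite_colour(unsigned nav, unsigned clr, bool lum)
{
    return static_cast<std::uint8_t>(SPRITE_BASE[nav & 3][clr & 3] << (lum ? 3 : 0));
}
```

For example, NAV 1 with CLR 2 gives 05 (magenta) with LUM clear and 28 with LUM set, matching terms 12 and 15 in the listing.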

Backgrounds and PVIs

The next set of six terms look straightforward enough, let’s make a table and see what they produce (once again we’re treating NAV0/NAV1 as a two-bit number):

Term 24 25 26 27 28 29
NAV 0 0 0 0 0 0
C1* 0 - - 1 1 1
C2* - 0 - 1 1 1
C3* - - 0 1 1 1
BKR - - - 1 - -
BKG - - - - 1 -
BKB - - - - - 1
SHELL - - - 0 0 0
EFF2 - - - 0 0 0
OUT 01 02 04 01 02 04
(- = don’t care)

This shows that the PVIs’ outputs are mapped directly onto the low red, green and blue bits of the output (these aren’t actually the LSBs, which are at the top of the output byte). The TTL-generated sprite has priority over both the PVI outputs and the background tilemap. The PVIs, the shell point, and area effect 2 also have priority over the background tilemap. Since the OBJ/SCR lines from the PVIs are not considered, the PVIs only take priority with coloured pixels, not with black object pixels.

Effects

The rest of the terms relate to effects. We can look at them all at once (NAV and COLEFF are two-bit numbers):

Term 30 31 32 34 35 36 37 38 39
NAV 0 0 0 0 0 0 0 0 0
C1* 1 1 1 1 1 1 1 1 1
C2* 1 1 1 1 1 1 1 1 1
C3* 1 1 1 1 1 1 1 1 1
BKR - 0 0 0 - - - - 0
BKG - 0 0 0 - - - - 0
BKB - 0 0 0 - - - - 0
SHELL 1 0 0 0 - - - - 0
EFF1 - 1 1 1 - - - - 1
EFF2 - 0 0 0 1 1 1 1 0
COLEFF - 0 1 3 0 1 2 3 2
OUT 38 84 84 04 80 40 c0 c0 40
(- = don’t care)

So what does this tell us? Lots of things! You might notice that the second and third columns can easily be reduced to a single term, and so could the seventh and eighth columns, clearly an oversight but not important since there’s space to spare in the PLA. More seriously, we can work out priorities from the rows:

  • Sprite and PVI output always have priority over these effects
  • Background, shell and effect 2 have priority over effect 1
  • Remember that shell and effect 2 are mutually exclusive, so there’s no priority between them

Now we can look at the output each effect actually produces:

  • The first column says the shell sets the MSBs of all three colours, making it light grey or “dark white”.
  • Effect 1 sets the background colour (depending on COLEFF) to medium blue (3), dark magenta (2), or just on the cyan side of medium blue (1 or 0).
  • Effect 2 sets the background colour (depending on COLEFF) to dark cyan (0), dark magenta (1), or dark grey (2 or 3).

Emulation

So how do we get this information into MAME? Well we could take what we’ve learned and write C++ code to draw the graphics in the appropriate sequence, but that would have several disadvantages. Firstly it’s not how the real machine works: the real machine works by composing a 16-bit value and feeding it through the PLA to get the output colour. Secondly, implementing it in C++ wouldn’t allow someone to try a different PLA dump to see what effect it has on gameplay. Thirdly, someone couldn’t develop their own PLA program to drop into MAME to play with for homebrew purposes. Instead, we use some more code to turn our intermediate representation of the PLA terms into a 64kB mapping table:

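for (UINT32 inp = 0x0000; 0xffff >= inp; inp++)
{
    m_mixing_table[inp] = 0;
    for (unsigned term = 0; 48 > term; term++)
    {
        // Low half holds the inputs and high half their complement; the term
        // is active when every input is at a level the term permits
        if (!~(inp | (~inp << 16) | products[term]))
            m_mixing_table[inp] |= sums[term];
    }
    // Apply the output polarity bits
    m_mixing_table[inp] ^= mask;
}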

Then we render scanlines into a 16-bit bitmap and pass it through this table to convert it to colours for MAME’s video output. (Yes, it’s possible to run the PLA equations on the bitmap data directly from the compact intermediate representation. That’s slower as it involves several logical operations per term, and 64kB is a small amount of memory these days for a cached lookup table. Then again, CPUs are getting pretty fast, and cache miss latency keeps getting worse, so perhaps sticking with the intermediate form wouldn’t be such a bad idea.)
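
For comparison, evaluating the terms directly for a single pixel might look something like the sketch below. It assumes the same products/sums/mask encoding as the decoding code above; evaluate_pla is a hypothetical helper, not a function in MAME.

```cpp
#include <cstddef>
#include <cstdint>

// products[n]: upper 16 bits flag inputs allowed to be high,
//              lower 16 bits flag inputs allowed to be low
// sums[n]:     outputs driven when term n is active
// mask:        output polarity bits applied at the end
constexpr std::uint8_t evaluate_pla(
        std::uint32_t const *products,
        std::uint8_t const *sums,
        std::size_t terms,
        std::uint8_t mask,
        std::uint16_t inp)
{
    // Inputs in the low half, inverted inputs in the high half
    std::uint32_t const lines = std::uint32_t(inp) | (std::uint32_t(std::uint16_t(~inp)) << 16);
    std::uint8_t y = 0;
    for (std::size_t term = 0; term < terms; ++term)
    {
        // Active when every input level present is one the term permits
        if ((lines | products[term]) == 0xffffffffU)
            y |= sums[term];
    }
    return static_cast<std::uint8_t>(y ^ mask);
}
```

This does 48 tests per pixel where the lookup table does one memory read, which is the trade-off described in the parenthetical above.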
