Piero V.

NetStylus

Preamble

Recently I started digital sculpting, and I immediately realized that the mouse is not the best tool for this purpose. As any tutorial will tell you, a drawing tablet will make you much faster and much more precise.

I do not have one, but I have a Microsoft Surface Pro and a Surface Pen. However, it is the base, not-so-powerful model: it has just a Core M3 and 4GB of RAM. It was enough to study at university, but, sadly, I cannot even think of running a 3D editor on it.

Initially, I tried Weylus, a program that allows you to control a machine (my Linux desktop, in my case) through a web browser from any device with a stylus. Being web-based, it works on any device, including iPads, and it even mirrors the screen.

However, it did not play well with the barrel button of my pen. And that button is critical for a lot of workflows.

Therefore, I decided to write my own software for that: NetStylus.

The Win32 API for tablets

I discovered that Microsoft has liked pen input methods for years: they started the Tablet PC thing with Windows XP, before 2005!

They have several APIs and features, but we are interested in the one that sits at the beginning of the chain: the Real-Time Stylus interface.

With it, you can receive the pen events, either in the main event loop (asynchronous plugin) or immediately after their generation, but in another thread (synchronous plugin). The latter has lower latency but should process the data as fast as possible, while the former is less performance-critical but may introduce some delays. Both may group several events together.

Having the lowest delay between the input and the output is the objective of any input method, so I chose the synchronous option.

From a technical point of view, an RTS plugin is a class that inherits from IStylusSyncPlugin. This interface contains a method for each type of event a stylus supports. All of them are pure virtual and must be implemented. At runtime, the plugin tells which ones are actually used with the DataInterest method.

Getting the data

For our application, we will focus mostly on Packets and InAirPackets, which are called when new packets are ready to be parsed, as their names state. The former is called when the stylus is touching the screen, the latter when the pen is just hovering over it.

Actually, the packets themselves can tell whether the pen was touching the screen. Therefore, I implemented both Packets and InAirPackets in the same way. I think the division was created so that a plugin can skip parsing the in-air packets and reduce processing time when possible.

Instead of having some kind of struct, the packet data is contained in an array of LONG. This array follows the same order as the properties returned by IRealTimeStylus::GetPacketDescriptionData.

The meaning of each property must be deduced by checking its GUID. I decided to keep a copy of these indices in a std::unordered_map rather than doing these queries for each packet.
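As a rough illustration of that cache (not the actual NetStylus code), the sketch below builds the lookup table once from the list of property identifiers. To stay portable, it uses a hypothetical PropertyId struct instead of a real Win32 GUID, and buildPropertyIndices is a name made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a Win32 GUID, so that the sketch builds anywhere.
struct PropertyId {
    std::uint64_t hi;
    std::uint64_t lo;
    bool operator==(const PropertyId& other) const {
        return hi == other.hi && lo == other.lo;
    }
};

// Simple hash combining the two halves of the identifier.
struct PropertyIdHash {
    std::size_t operator()(const PropertyId& id) const {
        return std::hash<std::uint64_t>{}(id.hi) ^
               (std::hash<std::uint64_t>{}(id.lo) << 1);
    }
};

using PropertyIndices =
    std::unordered_map<PropertyId, std::size_t, PropertyIdHash>;

// Builds the table once per tablet: property ID -> offset inside one packet.
// `description` is assumed to be in the same order as the packet values.
PropertyIndices buildPropertyIndices(const std::vector<PropertyId>& description) {
    PropertyIndices indices;
    for (std::size_t i = 0; i < description.size(); ++i) {
        indices.emplace(description[i], i);
    }
    return indices;
}
```

With such a table, finding where the pressure sits in a packet is a single lookup instead of a query to the stylus API.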

The properties we look for are x, y, (normal) pressure, tilt, and status. The last one is a bitfield that tells whether:

  • the point was touching the screen;
  • the pen was reversed (i.e., the eraser was used);
  • the barrel-button of the pen was pressed.
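Decoding such a bitfield could look like the sketch below. The bit positions (0 for touching, 1 for inverted, 3 for the barrel button) are an assumption on my side and should be checked against the packet status documentation; PenStatus and decodeStatus are hypothetical names.

```cpp
#include <cstdint>

// ASSUMED bit positions; verify them against the packet status documentation.
constexpr std::uint32_t kStatusTouching = 1u << 0;  // pen tip on the surface
constexpr std::uint32_t kStatusInverted = 1u << 1;  // eraser end in use
constexpr std::uint32_t kStatusBarrel   = 1u << 3;  // barrel button pressed

struct PenStatus {
    bool touching;
    bool inverted;
    bool barrel;
};

// Splits the raw status value of a packet into three booleans.
PenStatus decodeStatus(long raw) {
    const std::uint32_t bits = static_cast<std::uint32_t>(raw);
    return PenStatus{
        (bits & kStatusTouching) != 0,
        (bits & kStatusInverted) != 0,
        (bits & kStatusBarrel) != 0,
    };
}
```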

So, when packets arrive, for each one:

  1. we get the indices of the properties we need by checking the ID of the tablet;
  2. we rearrange the values as we prefer;
  3. we send the data through the network.
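The first two steps can be sketched as follows, assuming the callback hands over one flat buffer with a fixed number of values per packet. PacketLayout, PenSample, and splitPackets are hypothetical names, and the sketch leaves out tilt and the network part to stay short.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-tablet layout: offsets of the properties inside one packet.
// Every offset is assumed to be smaller than propertiesPerPacket.
struct PacketLayout {
    std::size_t propertiesPerPacket;
    std::size_t x, y, pressure, status;
};

// Hypothetical rearranged sample, ready to be serialized.
struct PenSample {
    long x, y, pressure, status;
};

// Walks the flat buffer packet by packet and picks the values we care about.
std::vector<PenSample> splitPackets(const std::vector<long>& buffer,
                                    const PacketLayout& layout) {
    std::vector<PenSample> samples;
    if (layout.propertiesPerPacket == 0) {
        return samples;
    }
    const std::size_t count = buffer.size() / layout.propertiesPerPacket;
    samples.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const long* packet = buffer.data() + i * layout.propertiesPerPacket;
        samples.push_back(PenSample{packet[layout.x], packet[layout.y],
                                    packet[layout.pressure],
                                    packet[layout.status]});
    }
    return samples;
}
```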

Both Packets and InAirPackets can change the packet contents, but this is not necessary for our purposes.

Pen units, millimeters, pixels, and DPI

The pen position is in multiples of 10µm (i.e., each mm is 100 units). However, this is not enough: we need to know the bounds of x and y, too.

That seems easy: you can get the size of the window contents with GetClientRect. GetDeviceCaps(dc, HORZSIZE) and GetDeviceCaps(dc, VERTSIZE) tell you the physical size of the screen in millimeters, from which you can derive its pixel density and convert the window size from pixels to a real-world length.

While this is the size of the window, it is not the maximum of x and y!

Windows has a property called pixels per logical inch, which is used to scale the pen position, among other things!

It was a bit difficult to get this right; I had to go by trial and error. Eventually, I decided to remove this additional scaling to make data correspond to the real-world length.
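To make the arithmetic concrete, here is a sketch of the two ways of turning a window size in pixels into pen units (100 units per mm). It is only my reading of the paragraphs above, under the assumption that the reported values are scaled by the logical DPI; the function names are hypothetical, and the actual NetStylus code may compute the bounds differently.

```cpp
#include <cmath>

constexpr double kUnitsPerMm = 100.0;  // one pen unit is 10 micrometers
constexpr double kMmPerInch = 25.4;

// Window extent in pen units when the values are scaled by the logical DPI
// (the "pixels per logical inch" setting).
long pixelsToLogicalUnits(int pixels, double logicalDpi) {
    return std::lround(pixels * kMmPerInch * kUnitsPerMm / logicalDpi);
}

// Window extent in real-world pen units, from the physical screen size.
long pixelsToPhysicalUnits(int pixels, int screenPixels, double screenMm) {
    return std::lround(pixels * (screenMm / screenPixels) * kUnitsPerMm);
}

// Rescales a reported coordinate so that it spans the physical extent.
long removeLogicalScaling(long value, long logicalMax, long physicalMax) {
    return std::lround(static_cast<double>(value) * physicalMax / logicalMax);
}
```

For example, at 96 logical DPI a 96-pixel-wide window spans one logical inch, that is, 2540 units, whatever the real size of those pixels is.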

These bounds may change, so I recompute them during the WM_SIZE and WM_MOVE events. The latter handles movements of the window to another screen with different DPIs, although this should not happen in our use-case. IStylusSyncPlugin::UpdateMapping is called whenever there are similar changes, so I recompute the bounds when this method is called, too.

And that’s it for our stylus plugin.

Some notes about the implementation

This was the first time I wrote a Win32 application. And I must say I do not like the C++ style I had to use. No wonder there are many C++-haters: they must think of this when you say «C++» to them.

Also, I would not have survived the COM mess without the help of an article about this. Kudos after 8 years, Anders Ekermo.

I think we could say that the stylus part is okay, after all. But the window management part is not, really. Blame me for not using resource files and creating the whole GUI with the CreateWindow function. But the inability to do something much better with code alone is quite annoying. I used to complain about Qt, but plain Win32 is just worse in every aspect.

Also, I used the singleton pattern, just for the WindowProc. While it is possible to associate a custom pointer with a window, some events are processed before it is available.

I wanted to do lots of cool stuff, like using the dark theme, but I could not even do the basic things, so I gave up on the advanced ones.

The network communication

For the network, I assumed optimal conditions: my desktop is connected through Ethernet, and it is on the same network as the Surface.

I did not want to invent anything complicated, so I decided to just relay the data the stylus receives, rearranged in a fixed order.

In addition to that, every packet also carries the maximum x, the maximum y, and the maximum pressure.

My original idea was to allow the user to resize the window if needed, but eventually, that did not work with evdev. Still, the difference between 40 and 52 bytes is negligible, and sending the bounds every time spares me implementing a handshake procedure, client tracking for the server, and a receive loop/thread on the client.
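To give an idea of what such a fixed-order datagram can look like, here is a sketch with a hypothetical field list, serialized as little-endian 32-bit integers. This is not the real NetStylus layout: the fields, their order, and the resulting size (40 bytes here, bounds included) are all assumptions of this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// HYPOTHETICAL field list; the real NetStylus layout and size may differ.
struct WirePacket {
    std::uint32_t sequence;
    std::int32_t x, y, pressure, tiltX, tiltY;
    std::uint32_t status;
    std::int32_t maxX, maxY, maxPressure;  // bounds repeated in every datagram
};

constexpr std::size_t kFieldCount = 10;
constexpr std::size_t kWireSize = kFieldCount * 4;

// Appends a 32-bit value in little-endian byte order.
void putU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>(v >> shift));
    }
}

// Reads a little-endian 32-bit value from four bytes.
std::uint32_t getU32(const std::uint8_t* p) {
    return std::uint32_t{p[0]} | (std::uint32_t{p[1]} << 8) |
           (std::uint32_t{p[2]} << 16) | (std::uint32_t{p[3]} << 24);
}

std::vector<std::uint8_t> encode(const WirePacket& p) {
    const std::uint32_t fields[kFieldCount] = {
        p.sequence,
        static_cast<std::uint32_t>(p.x),
        static_cast<std::uint32_t>(p.y),
        static_cast<std::uint32_t>(p.pressure),
        static_cast<std::uint32_t>(p.tiltX),
        static_cast<std::uint32_t>(p.tiltY),
        p.status,
        static_cast<std::uint32_t>(p.maxX),
        static_cast<std::uint32_t>(p.maxY),
        static_cast<std::uint32_t>(p.maxPressure),
    };
    std::vector<std::uint8_t> out;
    out.reserve(kWireSize);
    for (std::uint32_t f : fields) {
        putU32(out, f);
    }
    return out;
}

// Returns false when the datagram does not have the expected size.
bool decode(const std::vector<std::uint8_t>& in, WirePacket& p) {
    if (in.size() != kWireSize) {
        return false;
    }
    std::uint32_t f[kFieldCount];
    for (std::size_t i = 0; i < kFieldCount; ++i) {
        f[i] = getU32(in.data() + 4 * i);
    }
    p.sequence = f[0];
    p.x = static_cast<std::int32_t>(f[1]);
    p.y = static_cast<std::int32_t>(f[2]);
    p.pressure = static_cast<std::int32_t>(f[3]);
    p.tiltX = static_cast<std::int32_t>(f[4]);
    p.tiltY = static_cast<std::int32_t>(f[5]);
    p.status = f[6];
    p.maxX = static_cast<std::int32_t>(f[7]);
    p.maxY = static_cast<std::int32_t>(f[8]);
    p.maxPressure = static_cast<std::int32_t>(f[9]);
    return true;
}
```

Writing the bytes explicitly, instead of sending the struct as it is in memory, keeps the format independent of padding and of the endianness of the two machines.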

The communication uses UDP, so I also added a sequence number, although I expect it not to be necessary for a local network. The server drops packets with a sequence number lower than the last received one, unless the difference is above a certain threshold, to handle the change of client.
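The drop rule described above fits in a few lines. The following is a sketch of it, not the server's actual code: SequenceFilter and its threshold parameter are hypothetical names, and, following the text literally, a datagram that repeats the last sequence number is accepted.

```cpp
#include <cstdint>

class SequenceFilter {
public:
    explicit SequenceFilter(std::uint32_t resetThreshold)
        : resetThreshold_(resetThreshold) {}

    // Returns true when the datagram should be processed.
    bool accept(std::uint32_t sequence) {
        // An older number within the threshold is a late or reordered datagram.
        // A much older one is assumed to come from a new client that restarted
        // its counter, so it is accepted and becomes the new reference.
        if (hasLast_ && sequence < last_ &&
            last_ - sequence <= resetThreshold_) {
            return false;
        }
        last_ = sequence;
        hasLast_ = true;
        return true;
    }

private:
    std::uint32_t resetThreshold_;
    std::uint32_t last_ = 0;
    bool hasLast_ = false;
};
```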

It is trivial and might have weaknesses, but hey, it works, for the moment. I am still at a proof of concept stage, so, for now, I will not improve this aspect (see the Conclusions section).

evdev

So far, I have been talking about the client, which does not make any assumptions about the server.

I wanted it to work with my Linux PC, so I implemented a server based on evdev, a generic input interface for Linux and FreeBSD, and libevdev.

As a first step, we create a new device and assign a series of event codes, grouped by type. We start with EV_ABS for all the data we have as an absolute value: x and y position, pressure, and tilt. For this type of event, libevdev_enable_event_code needs a pointer to an input_absinfo struct, which tells the minimum, maximum, and resolution of the value. The resolution states which units are being used: units per mm for the position, and units per radian for the tilt. Optionally, we could pass a fuzz value to discard events whose change is below this threshold.
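The values that end up in those structs can be sketched as below. To keep the example buildable without kernel headers, AbsInfo is a hypothetical struct that mirrors the fields of input_absinfo; in the real server, the equivalent input_absinfo would be handed to libevdev_enable_event_code. The tilt unit (whole degrees) is an assumption of this sketch.

```cpp
#include <cmath>
#include <cstdint>

// Mirrors the fields of the kernel's input_absinfo (hypothetical stand-in).
struct AbsInfo {
    std::int32_t value, minimum, maximum, fuzz, flat, resolution;
};

// Position axis: pen units are 10 micrometers, so 100 units per millimeter.
AbsInfo positionAxis(std::int32_t maximum) {
    return AbsInfo{0, 0, maximum, 0, 0, 100};
}

// Pressure has no physical unit here, so the resolution is left at 0.
AbsInfo pressureAxis(std::int32_t maximum) {
    return AbsInfo{0, 0, maximum, 0, 0, 0};
}

// Tilt axis. ASSUMPTION: tilt arrives in whole degrees, in the range
// [-maxDegrees, maxDegrees]; the resolution is in units per radian.
AbsInfo tiltAxis(std::int32_t maxDegrees) {
    const double kPi = 3.14159265358979323846;
    const std::int32_t unitsPerRadian =
        static_cast<std::int32_t>(std::lround(180.0 / kPi));
    return AbsInfo{0, -maxDegrees, maxDegrees, 0, 0, unitsPerRadian};
}
```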

We also enable some events of the EV_KEY type. The pen button is seen as a key (BTN_STYLUS), and this makes sense. But that’s not all! Even using the pen or the eraser is seen as a button (BTN_TOOL_PEN and BTN_TOOL_RUBBER, respectively). Finally, we add a button to tell whether the stylus is touching the screen (BTN_TOUCH).

Now we are ready to convert the received network packets into input events.
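One way to picture the conversion is a function that turns a decoded sample into a list of (type, code, value) triples, closed by a SYN_REPORT. This is a sketch and not the actual server code: Sample, Event, and toEvents are hypothetical names, the numeric constants are copied from linux/input-event-codes.h so that the example does not need kernel headers, and handling of the pen leaving the proximity of the screen is left out.

```cpp
#include <cstdint>
#include <vector>

// Values copied from linux/input-event-codes.h.
constexpr std::uint16_t kEvSyn = 0x00;
constexpr std::uint16_t kEvKey = 0x01;
constexpr std::uint16_t kEvAbs = 0x03;
constexpr std::uint16_t kSynReport = 0x00;
constexpr std::uint16_t kAbsX = 0x00;
constexpr std::uint16_t kAbsY = 0x01;
constexpr std::uint16_t kAbsPressure = 0x18;
constexpr std::uint16_t kAbsTiltX = 0x1a;
constexpr std::uint16_t kAbsTiltY = 0x1b;
constexpr std::uint16_t kBtnToolPen = 0x140;
constexpr std::uint16_t kBtnToolRubber = 0x141;
constexpr std::uint16_t kBtnTouch = 0x14a;
constexpr std::uint16_t kBtnStylus = 0x14b;

struct Event {
    std::uint16_t type;
    std::uint16_t code;
    std::int32_t value;
};

// Hypothetical decoded sample, as received from the network.
struct Sample {
    std::int32_t x, y, pressure, tiltX, tiltY;
    bool touching, inverted, barrel;
};

// Builds the events for one sample; the final SYN_REPORT marks them as one frame.
std::vector<Event> toEvents(const Sample& s) {
    return {
        {kEvAbs, kAbsX, s.x},
        {kEvAbs, kAbsY, s.y},
        {kEvAbs, kAbsPressure, s.pressure},
        {kEvAbs, kAbsTiltX, s.tiltX},
        {kEvAbs, kAbsTiltY, s.tiltY},
        {kEvKey, kBtnToolPen, s.inverted ? 0 : 1},
        {kEvKey, kBtnToolRubber, s.inverted ? 1 : 0},
        {kEvKey, kBtnTouch, s.touching ? 1 : 0},
        {kEvKey, kBtnStylus, s.barrel ? 1 : 0},
        {kEvSyn, kSynReport, 0},
    };
}
```

With libevdev, each triple would then be written to the uinput device in order.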

So far, so good, but I cheated a little bit. In reality, there is a problem: evdev has some functions to update the maximum and minimum values of a code, but I could not make them work.

There are a couple of possible workarounds:

  1. wait for the first packet and only then create the device;
  2. use a default range, and then rescale all the received values if it changes.

I chose the first one in case some application needs real-world values.

So the order is slightly different from what I wrote before:

  1. start listening on UDP;
  2. wait for the first packet;
  3. create the device;
  4. continue listening for packets indefinitely.
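The "create on first packet" part can be isolated in a small helper, sketched below. LazyDevice, Bounds, and the creation callback are hypothetical names; in the real server, the callback would be the code that sets up the libevdev/uinput device with the bounds found in the first datagram.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Bounds carried by every datagram.
struct Bounds {
    int maxX, maxY, maxPressure;
};

// Creates the device lazily, with the bounds of the first packet only.
template <typename Device>
class LazyDevice {
public:
    explicit LazyDevice(std::function<Device(const Bounds&)> create)
        : create_(std::move(create)) {}

    // Called for every packet; the creation callback runs only the first time.
    Device& onPacket(const Bounds& bounds) {
        if (!device_) {
            device_.reset(new Device(create_(bounds)));
        }
        return *device_;
    }

    bool created() const { return device_ != nullptr; }

private:
    std::function<Device(const Bounds&)> create_;
    std::unique_ptr<Device> device_;
};
```

Later packets reuse the same device, so bounds that change afterwards are ignored in this sketch.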

I also implemented a way to handle SIGINT to perform a clean shutdown, and I added a timeout for socket operations for the same purpose.

However, it is impossible to pass a custom pointer to the signal handler, so I used a global flag to tell whether the program should stop.

Please note that to create the evdev device, you need to be able to write to the /dev/uinput file, e.g., run as root the following command:
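A minimal version of this pattern, using only the standard library, could look like the following. The names are hypothetical; receiveOnce stands for a socket read with a timeout, so that the loop gets a chance to check the flag even when no packets arrive.

```cpp
#include <csignal>

// Global flag: a signal handler cannot receive a user-supplied pointer.
volatile std::sig_atomic_t gStopRequested = 0;

// Only sets the flag; the actual cleanup happens in the main loop.
extern "C" void handleSigint(int) {
    gStopRequested = 1;
}

// Installs the handler; returns false if std::signal failed.
bool installSigintHandler() {
    return std::signal(SIGINT, handleSigint) != SIG_ERR;
}

// Loop skeleton: runs until the flag is set and returns the iteration count.
template <typename ReceiveOnce>
int runUntilInterrupted(ReceiveOnce receiveOnce) {
    int iterations = 0;
    while (!gStopRequested) {
        receiveOnce();
        ++iterations;
    }
    return iterations;
}
```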

chmod 666 /dev/uinput

Conclusions

For some time now, I have been using my Surface to watch videos and read documents or manga. Now I can use it and its pen again, and I have lots of fun with sculpting.

However, this is not a perfect setup: the Surface Pen is a bit heavy, and sometimes I have difficulties in pressing its button or finding the most comfortable position that allows me to do so. In many cases, I ended up with the pen pressing between my thumb and index finger, and after some time, it was a bit painful.

Finally, the Surface glass offers too little friction. I have always considered this a small defect of this device. I noticed this also while taking notes at university.

But for the time being, I will continue using this system. Knowing me, I might get bored of sculpting soon 😅️.

If you are interested in using the project, or its source code, you can find it on my GitHub.