Catégorie : Computer science

  • ROLI Airwave: another journey of technical issues

I purchased this new Airwave device from ROLI in November 2024. I knew it would take a couple of months to get the device, as with my RISE 2, but didn’t think it would take more than a year. It took more time than expected to fully test the device, and there were manufacturing hurdles throughout the process. At least ROLI provided regular updates to customers and finally shipped the units.

On Friday, November 28th, 2025, I finally got the device. Given the time it took to get it ready, I was hoping for a flawless setup and use. I was deeply mistaken. It took my whole Friday evening to get it working, and only partially: just the standalone player, with no recording possible unless I switched to the two-computer setup I have no room for and have wanted to avoid for years. The next day, I found a solution to make the VST work. Nevertheless, the calibration never completes and the device is pretty limited in the end, although it works from a hardware point of view. It is perfectly possible that the Airwave Player software will evolve into a full synth and that the calibration bugs will eventually be fixed officially.

    What is the ROLI Airwave supposed to do?

The Airwave tracks hand movements using a camera and translates them into MIDI signals. This adds 5 dimensions of air on top of the 5 dimensions of touch. The device detects when hands are raised, bent, moved left or right, or twisted. A dedicated software synthesizer, Airwave Player, takes into account the new 5 dimensions of air and the existing 5 dimensions of touch, adding more richness and variation to the sound. The Airwave Player has a fixed number of presets and no editor. However, the Airwave produces MIDI CC. With a bit of work, these CC could be mapped to parameters in any other software synthesizer.
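To give an idea of what such mapping work involves, here is a minimal sketch using the standard javax.sound.midi API (not a ROLI API; the class name is mine) that prints every control change arriving on the MIDI inputs it can open, skipping ports that are already in use:

import javax.sound.midi.*;

public class CcMonitor {
    public static void main(String[] args) throws Exception {
        for (MidiDevice.Info info : MidiSystem.getMidiDeviceInfo()) {
            MidiDevice device = MidiSystem.getMidiDevice(info);
            // Keep only devices that can act as MIDI inputs (they have transmitters).
            if (device.getMaxTransmitters() == 0) {
                continue;
            }
            try {
                device.open();
                device.getTransmitter().setReceiver(new Receiver() {
                    @Override
                    public void send(MidiMessage message, long timeStamp) {
                        if (message instanceof ShortMessage) {
                            ShortMessage sm = (ShortMessage) message;
                            if (sm.getCommand() == ShortMessage.CONTROL_CHANGE) {
                                System.out.printf("%s: CC %d = %d (channel %d)%n",
                                        info.getName(), sm.getData1(), sm.getData2(),
                                        sm.getChannel() + 1);
                            }
                        }
                    }

                    @Override
                    public void close() {
                    }
                });
            } catch (MidiUnavailableException e) {
                // Skip ports already claimed by another process; MIDI access
                // is exclusive on Windows, as discussed later in this post.
            }
        }
        Thread.sleep(60_000); // Listen for one minute, then exit.
    }
}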

The device is shown in the following picture:

    The ROLI Airwave device sitting on my desk, with my ROLI Seaboard Block M in front of it.

The Airwave is plugged into the computer using its USB C port, with the provided C to C cable or a C to A cable. There is another USB C port to power the device.

Connectors at the back of the ROLI Airwave. From left to right: a minijack audio connector for earphones, a jack connector for an expression pedal, and the two USB C ports, both in use.

There is a connector for an expression pedal as well as an audio jack; I am not sure what the audio jack is used for.

A MIDI controller, preferably from ROLI, e.g., a Seaboard Block, can either be connected independently to the computer through another USB port, or plugged into one of the front ports of the Airwave using the provided magnetic connectors.

    ROLI Seaboard Block M with the magnetic USB C connector plugged in, ready to be plugged into the Airwave's front port.

    A difficult setup

When I plugged in my new Airwave device, ROLI Connect could not see it, making it impossible to register. Registration is necessary to get access, from ROLI Connect, to the ROLI Airwave Control and ROLI Airwave Player software. After I passed this hurdle, I was not getting any hand tracking from Airwave Player, because I had to create a virtual MIDI port I didn’t know about. With this second roadblock behind me, hand tracking was working, but I faced issues with calibration: either it is broken, or Airwave Player just doesn’t report that calibration is done, leaving the user to guess whether it completed or not. After all these efforts, a wasted Friday evening with a lot of frustration, I was able to try my new device with the standalone ROLI Airwave Player, but when trying the VST in my DAW, no hand tracking occurred. I had to find another trick, requiring a basic understanding of how the device works, to get past this.

    Device registration not working

When I connected my new ROLI Airwave, I heard the usual sound indicating Windows had detected the device, but it was invisible from ROLI Connect. I tried both USB ports and tried a USB A to C cable instead of the provided C to C cable, to no avail. I started to think the device was dead on arrival, but at least I got confirmation from Device Manager that the device was detected:

    The ROLI Airwave showing up in Windows Device Manager

The Ultraleap device is the Airwave’s camera. The ROLI Airwave Pedal is a MIDI source mapped to the pedal connector on the device. The presence of the Seaboard Block shows that USB data was properly passing from my Seaboard Block M through the Airwave. Despite all this, nothing in ROLI Connect!

After a while, I found that once again, ROLI Dashboard was broken! Pretty much every time ROLI Connect auto-updates, it kills the ROLI Hardware Driver service instead of leaving it alone or updating it properly. This bug has existed for more than a year; I don’t get why it hasn’t been fixed, or how non-technical users manage to deal with the issue. Once again, I had to manually reinstall the ROLI Hardware Driver so ROLI Dashboard would recognize my devices again.

After this step, ROLI Connect was able to see and register my new Airwave device. This unlocked ROLI Airwave Control and ROLI Airwave Player, which I installed.

    No hand tracking

After installing ROLI Airwave Player, I started it. I was able to play notes and experience the 5 dimensions of touch. However, moving my hands in front of the camera had no effect. Searching the Web uncovered the possibility of a broken camera. I started to think I would need to return the device to the manufacturer, which annoyed me a lot.

The problem was a missing virtual MIDI port named ROLI Airwave Expression. I already had loopMIDI installed, so I started the tool and created the port.

I was puzzled about how this works. The ROLI Airwave Service, which I verified was running, is hard-coded to write to that MIDI port. It doesn’t create the port; it assumes it is already there. The user has to install a tool like loopMIDI and create the port manually.
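If you want to check whether the port exists without launching a DAW, a minimal sketch with the standard javax.sound.midi API (the class name is mine) can list what Windows exposes; "ROLI Airwave Expression" should appear once the loopMIDI port has been created:

import javax.sound.midi.MidiDevice;
import javax.sound.midi.MidiSystem;

public class ListMidiPorts {
    public static void main(String[] args) {
        // Prints every MIDI port visible to the Java runtime.
        for (MidiDevice.Info info : MidiSystem.getMidiDeviceInfo()) {
            System.out.println(info.getName() + " - " + info.getDescription());
        }
    }
}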

    After that, I was able to try calibrating the device. This is done by starting Airwave Control and clicking on Recalibrate.

    Main window of Airwave Control

    This starts the Airwave Player.

    Airwave Player in calibration mode, asking which device to use.

    I clicked on my Seaboard Block M device and was asked to press on the highlighted keys to calibrate.

    Broken calibration screen asking to hold both keys while three keys (C, F, G) are lit.

The instructions are quite misleading, asking me to press both keys with my index fingers while three keys are lit up: the C, the F and the G! After I pressed the leftmost C key and the rightmost F, tried G, then tried F and G, the screen changed to the following.

    Calibration is "finished" or stuck, asking to try pressing keys closer to the middle of the keyboard.

    Trying to press keys closer to the middle gives no result. Calibration never completes. Fortunately, the device reacts to hand movement, so at least its camera is not broken.

Some research showed that there was a bug with calibration in prior versions of Airwave Player. I double-checked (checked multiple times, in fact) that I was running the latest version, to no avail. Some users contacted ROLI and got special, limited-availability betas of Airwave Player for which calibration works.

This process is totally inefficient, for both users and ROLI staff. Users will try calibrating and fail; some will just return the device thinking it is defective, others will report the issue, get the special beta, and move on. ROLI staff, on the other hand, will be flooded with the same request from multiple users for the special beta.

    Question: why not release that beta? If calibration doesn’t complete out of the box, this will confuse many users and could cause unnecessary returns.

Another possibility is that calibration never completes by design and the user has to guess that the device is calibrated. This is, in my opinion, as poor as having users individually request a special beta.

    I hope this issue will get fixed over time or the UI will be clarified. This device has potential, but the shortcomings need to be addressed first.

    At least, I was able to start playing with my new device. It was not just dead on arrival at this point.

However, I don’t want to just play with the device; I want to record myself. Enter the DAW: Ableton Live.

As I expected, ROLI provides a VST for the Airwave.

I added the VST to a MIDI track and configured my input as the Seaboard. I also tried the ROLI Airwave Pedal and ROLI Airwave Expression inputs, to no avail. Only the Seaboard MIDI input flowed MIDI into the VST, generating sound.

As soon as the VST produces sound, it is possible to create an audio track and configure its input to be the output of the MIDI track + VST. This technique records the audio of the VST as-is and can reproduce it reliably.

Why not just record the MIDI notes? Because the VST has state, i.e., the currently selected preset. If you record just the MIDI, you have to take note of which preset is selected and make sure the same preset is selected when you replay your track, otherwise you get different audio.

    All of this worked, same way as with any other VST such as ROLI Studio Player, Arturia’s Analog Lab V, Sampleson Scaper, etc. However, no hand tracking worked!

Out of the box, this only works with the standalone Airwave Player, not the VST. The solution? Once again, a two-computer setup. I don’t have enough room on my desk for two computers, and I find such a setup cumbersome.

    So back to square 1, RMA it would be!

The two-computer setup I don’t want

I have wondered about such a setup several times, mainly because of VST instability. If a VST has a bug, it can easily crash the whole DAW, completely ruining a recording session. For example, Scaper from Sampleson crashes the whole DAW if I change presets too often or too quickly. ROLI Studio Player has a sad history of crashes. Some DAWs, like Reaper, sandbox VSTs; most don’t, because implementing VST sandboxing would be a major change requiring a lot of refactoring. If VSTs are too unstable and I don’t want to switch DAWs, the only remaining solution is to record on one computer and run the plugin on another machine.

flowchart LR
   Controller[MIDI Controller]
   Computer1[Computer with player]
   DAC[Digital to analog interface]
   ADC[Analog to digital interface]
   Computer2[Computer with DAW]

   Controller --> Computer1
   Computer1 --> DAC
   DAC -->|Jack cable|ADC
   ADC --> Computer2

This setup, while shielding the DAW from VST crashes and working around cases where only a standalone player is available, is quite cumbersome. Having to turn digital audio into analog and back to digital can reduce quality unless high-end audio interfaces are used. It would be far better to move the audio digitally between the two computers; S/PDIF may allow this, but it is not fully reliable.

    The single computer recording trick

I fortunately found a way to record with hand tracking. I needed to build an understanding of how the Airwave works (see the next section) to figure this out. Web searches and AI gave false information, including claims that the VST cannot play audio, that only Airwave Player can track hand movement, etc.

The main idea is to have two MIDI tracks: the main track takes MIDI from the Seaboard device and outputs to the Airwave VST; a secondary MIDI track takes MIDI from ROLI Airwave Expression (yes, the virtual MIDI device we created with loopMIDI) and outputs to the primary track. In other words, we need to combine MIDI from two different sources and feed that combined MIDI into the VST, as sketched in the flowchart below.
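In the same flowchart notation used elsewhere in this post (the track names are mine):

flowchart LR
   Seaboard[Seaboard MIDI input] --> Track1["MIDI track 1 (Airwave VST)"]
   Expression[ROLI Airwave Expression] --> Track2["MIDI track 2 (monitoring In)"]
   Track2 --> Track1
   Track1 --> Audio["Audio track (armed)"]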

    More specifically, you first need to ensure MPE is enabled for both MIDI ports and the VST. For this, go to Live’s Preferences, MIDI section, and enable MPE for the MIDI inputs.

    Then create a new MIDI track, adding the Airwave VST to it. Click on the … icon of the VST and turn on MPE.

Set the MIDI device and do one of the following: arm the track, or set monitoring to In. If you arm the track, you will be able to record the MIDI, but if you play a lot with the dimensions of touch and air, this generates over-cluttered MIDI tracks that tend to slow down and even crash Live. Not arming the track is a workaround, but you then need to turn on monitoring so MIDI flows through.

Now create a second MIDI track. Set the input to Airwave Expression and output MIDI to the first track. This time, just set monitoring to In. All MIDI will land in the first track, so if you want to capture it, just arm the first track.

    Almost there! Now create the audio track. Set its input as the output of the VST, and arm it.

You can validate the setup by checking the gauges beside All Channels. If you press keys on your Seaboard, you get activity on the first MIDI track. If you move your hands in front of the camera, you get activity on the second MIDI track. If audio works, you get activity on the first track and on the audio track. Now you can record!

You should save this non-trivial setup in a template for future use; otherwise you’ll fight with Live each time you start a new session.

    Voila. This is how I record at the moment.

    How does it work?

The following diagram shows the architecture as far as I could guess it. The presence of all these moving parts explains why issues like the ones I experienced can arise.

    flowchart LR
       subgraph ROLI Airwave
          Camera
          MIDI[MIDI interface]
          AirwaveUSB[USB interface]
       end
       AirwaveService[ROLI Airwave Service]
       HardwareDriver[ROLI Hardware Driver]
       VirtualMIDIPort["ROLI Airwave Expression (loopMIDI)"]
       VirtualMIDIPort2["Virtual MIDI port (multi client)"]
       AirwavePlayer[ROLI Airwave Player]
       Dashboard[ROLI Dashboard]
       Control[ROLI Airwave Control]
    
       MIDI --> HardwareDriver
       HardwareDriver --> VirtualMIDIPort2
       Camera & MIDI --> AirwaveUSB
       Camera --> AirwaveService
       AirwaveService --> VirtualMIDIPort
       VirtualMIDIPort & VirtualMIDIPort2 --> AirwavePlayer
       VirtualMIDIPort --> Control
       VirtualMIDIPort2 --> Dashboard
    

The ROLI Airwave device interacts with a computer through USB C. The device has two ports: one used for power through a provided adapter, the other used for data. The USB data interface exposes at least two components: a webcam-style camera and a MIDI interface. MIDI can come from an expression pedal plugged into the device and from a Seaboard plugged into one of the front ports.

A new system service called ROLI Airwave Service reads from the camera and turns the images into MIDI CC, most probably using some form of neural network. The fact that the CPU fan runs all the time when the device is plugged in leads me to believe the processing is CPU-based, even though it could use my NVIDIA RTX GPU to reduce CPU usage. The MIDI CC from hand tracking is sent to a virtual MIDI port named ROLI Airwave Expression. ROLI doesn’t provide its own MIDI driver for this, instead letting the user install loopMIDI and create the port.

    On the other hand, MIDI from the BLOCK devices (Seaboard, Lightpad, etc.) is processed by another service called ROLI Hardware Driver. The driver sends the MIDI data to another virtual MIDI port created by a proprietary multi-client driver.

This complexity is the result of a limitation in Windows, where any MIDI connection is exclusive: if a DAW reads from a MIDI device, another DAW or dashboard software cannot read from it at the same time. The ROLI Airwave Expression virtual port is read both by ROLI Airwave Player (or a DAW + VST setup) and by ROLI Airwave Control, which participates in the calibration process. The BLOCK virtual MIDI port is read by players (Airwave Player, Studio Player, DAW + VST) and by ROLI Dashboard, which is needed to change settings and perform firmware updates on the devices. The architecture also allows consolidating BLOCK devices into one virtual device used by Studio Player.

  • How to transfer ROLI and Ableton Live settings to a new machine?

Reinstalling Windows or switching computers doesn’t have to mean losing all your Ableton Live clips, templates, etc. You should not have to start from scratch in VSTs such as ROLI Studio Player or Arturia’s Analog Lab. Even worse is losing not only your favorites but also custom patches. All of this is stored in files that can be located, backed up and copied from one system to another. The problem is finding these files.

    This post gathers notes about locations I found over time, so I don’t have to search these again and again, and hopefully this can be useful to others.

    Ableton Live User’s library

    From the File Explorer, click Documents.

    There should be a folder named Ableton. Here, there is a subfolder named User Library. The final path looks something like:

    C:\users\<your user name>\Documents\Ableton\User Library

    This directory has several useful subdirectories worth copying over or backing up:

    • Templates: Ableton Live templates used to start new Live sets without having to add tracks for your audio inputs or software synthesizers each time.
    • Clips: audio clips you want to be accessible from any new project.
    • Samples: sample files, such as loops you recorded some time ago.

Note that if you are using OneDrive or Dropbox, you should disable backing up of the Documents folder. If you turn on backup of the Documents folder, Live may search for the user library in the OneDrive directory instead, which could affect its performance.

    ROLI Studio Player favorites

These are the hearts you can associate with presets you like.

Press the Windows key and type %AppData%. Go to the ROLI subdirectory, then Shared, and copy the favourites.xml file. The complete path looks something like

    C:\users\<your user name>\AppData\Roaming\ROLI\Shared\favourites.xml

    ROLI presets

    Presets are located elsewhere:

    • Equator 2: c:\users\<user name>\Documents\ROLI\Equator2: Playlists and Presets directories are particularly interesting
    • Cypher2: c:\users\<user name>\Documents\FXpansion\Cypher2: Favourites and Presets are especially useful
    • Strobe2: c:\users\<user name>\Documents\FXpansion\Strobe2: again Favourites and Presets

    Arturia’s V Collection presets

    These are stored in C:\ProgramData\Arturia\Presets. For each product, there is a directory in there, with a Factory and a User subdirectory.
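To avoid hunting these paths down by hand, here is a minimal Java sketch that copies them all to a backup directory; the class name and destination path are hypothetical, and the source paths assume the default locations listed above:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class BackupSettings {
    public static void main(String[] args) throws IOException {
        Path home = Paths.get(System.getProperty("user.home"));
        Path backup = Paths.get("D:\\SettingsBackup"); // hypothetical destination

        copyTree(home.resolve("Documents\\Ableton\\User Library"), backup.resolve("Ableton User Library"));
        copyTree(home.resolve("AppData\\Roaming\\ROLI\\Shared"), backup.resolve("ROLI Shared"));
        copyTree(home.resolve("Documents\\ROLI\\Equator2"), backup.resolve("Equator2"));
        copyTree(home.resolve("Documents\\FXpansion\\Cypher2"), backup.resolve("Cypher2"));
        copyTree(home.resolve("Documents\\FXpansion\\Strobe2"), backup.resolve("Strobe2"));
        copyTree(Paths.get("C:\\ProgramData\\Arturia\\Presets"), backup.resolve("Arturia Presets"));
    }

    // Recursively copies a directory tree, skipping locations that don't exist
    // (not everyone has every product installed).
    static void copyTree(Path source, Path target) throws IOException {
        if (!Files.exists(source)) {
            System.out.println("Skipping missing " + source);
            return;
        }
        try (var paths = Files.walk(source)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                Path dest = target.resolve(source.relativize(p).toString());
                if (Files.isDirectory(p)) {
                    Files.createDirectories(dest);
                } else {
                    Files.createDirectories(dest.getParent());
                    Files.copy(p, dest, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}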

  • The broken ROLI dashboard

    ROLI Dashboard is used to control certain aspects of ROLI instruments such as the Seaboard and Lightpad. In particular, it can be used to change the Lightpad’s mode, going from the drum pad to the RISE controller. Unfortunately, sometimes, apparently more and more often, the Dashboard won’t work at all, showing the following error.

    Oh no! It looks like the ROLI Hardware Driver isn’t running.

It gets even worse: start ROLI Studio Player, either standalone or as a plugin, and you’ll see it cannot configure the Lightpad to control studio parameters.

This problem often appears after ROLI Connect has performed an update. During the update process, the ROLI Hardware Driver is uninstalled and, for some unknown reason, not reinstalled properly. I started to have this issue after upgrading to Windows 11, but my friend on Windows 10 also experienced the problem after things had been working fine.

    This post shows how I fixed this issue on three different systems so far. This may not be a universal solution. Other problems can happen with ROLI software, with any brand of software in fact, and I cannot be sure all readers will face the same issue. It is thus important to understand what you are doing. Do not just apply this blindly as a recipe.

Is the ROLI Hardware Driver still there?

The first step is to verify that the ROLI Hardware Driver is still present as a Windows service. For this, press the Windows key to get the Start menu and type Services. Then, in the list of services, search for ROLI Hardware driver. You may (or may not) find the service, as shown in the following image.

    Windows services, showing ROLI Hardware driver is there

If the service is there, try right-clicking its name and selecting Start. If the service doesn’t start, you unfortunately have a new bug that is not covered by this post. If the service is not there, you are in the right place and can follow the instructions below. I found no complete solution to this problem and had to do my own research to come up with something that fully solves the issue… until the next update.

Note however that, given the weirdness of the issue, it could well happen that the following instructions won’t work or won’t apply. You may then have to contact ROLI’s technical support and hope for the best.

    Fixing the ROLI Hardware driver

The first thing is to make sure the ROLI Hardware driver runs. Verify that ROLI Connect is still working; if it doesn’t, reinstall it. The ROLI Dashboard should also « work », meaning it starts with no error other than the one above. If you cannot find ROLI Hardware driver in the Services window, the first step is to reinstall the service, as follows.

    1. Press the Windows key and search for the command prompt.
    2. Right click on the Command prompt text and select Run as administrator. You will have to approve the request to run the process as an administrator.
3. Type cd « \Program Files\ROLI\ROLI Hardware Driver » and press Enter. If the command fails, you have a different issue. First try to uninstall ROLI Dashboard and ROLI Connect, then reinstall ROLI Connect, and come back to this.
    4. Type svc and press Enter. This command will only work if you run as an administrator, otherwise you’ll get an error message.
    5. Press i and Enter, to install a new service.
    6. As a service name, enter ROLI Hardware driver.
    7. As display name, also enter ROLI Hardware driver.
    8. When asked if the service can interact with desktop, press n for No and Enter.
    9. When asked for the start type, enter a for automatic, and press Enter.
10. Leave error control as the default (i) by just pressing Enter.
    11. For the binary path, enter exactly what follows: "\Program Files\ROLI\ROLI Hardware Driver\srvstart.exe" "ROLI Hardware Driver" -c "\Program Files\ROLI\ROLI Hardware Driver\srvstart_rhd.ini"
12. svc asks for confirmation about the binary path, which it finds suspect. Just respond y for yes and press Enter.
    13. Press y and Enter when asked if the service must be started as the LocalSystem account.
    14. Press n and Enter when asked if service has dependencies.
    15. When asked if you are sure you want to install the service, press y and Enter.
    16. When asked for a new action, press x and Enter to exit.
    17. Type exit and press Enter to close the command prompt.

After all these steps, the ROLI Hardware driver service should be present in the list of services when you press the Windows key and search for Services. Right-click on ROLI Hardware driver and select Start. The service should start after a few seconds.

    If the driver runs correctly, ROLI Dashboard will show your devices or, at worst, a screen similar to the following asking you to connect a device.

    The MIDI driver

Making the Hardware Driver work is just part of the solution. After you reinstall the driver as a service, you may notice that ROLI Dashboard only works partially. As soon as you start using your ROLI device with Studio Player or your DAW, you will quickly notice that Dashboard stops working. This is because access to the ROLI device is exclusive to one process. If you want to switch the mode of your Lightpad, you will most likely need to unplug it, replug it, and try again until Dashboard gets control of it; then you’ll have to unplug and replug until your DAW or Studio Player gets control, etc. This can be solved as follows!

First press the Windows key and search for programs. Go to Add or remove programs and search for ROLI.

    If you don’t see roliMIDI64, you need to install that manually; this will fully solve the ROLI Dashboard issue.

For this, press Windows key + E to start Explorer and go to C:\Program Files\ROLI\ROLI Hardware Driver. Double-click on ROLIMultiClientMIDI64.msi and let it install.

    That’s it. Problem solved. Now your DAW, Studio Player and Dashboard can see your ROLI devices and access the ROLI-specific settings again.

  • JAXB puzzles with Java 11

JAXB, which stands for Java Architecture for XML Binding, is a library that can be used to persist Java objects as XML documents. XML is a text-based format that can represent complex and hierarchical data. JAXB can unmarshal an XML document directly into objects whose classes are defined by the developer, rather than letting the developer parse a generic abstract representation, such as the Document Object Model (DOM), into domain-specific objects. JAXB can also marshal objects into XML documents. This can be used to persist data in XML, a text-based, platform-independent format, as opposed to alternatives like Java object serialization.

    For an introduction to JAXB, see for example Guide to JAXB.

Object mapping is not specific to JAXB. The Jackson library can be used to parse JSON documents directly into objects, while SnakeYaml can do the same for YAML. These libraries are not essential but very handy. Instead of just parsing the XML, JSON or YAML files into an abstract tree that then needs to be traversed by tedious and repetitive code, they directly map the data to business objects, performing some degree of validation in the process.
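To make the mapping concrete, here is a minimal, self-contained sketch; the Project class and its fields are hypothetical, and the imports match the javax.xml.bind 2.x line of the API used later in this post:

import javax.xml.bind.JAXBContext;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import java.io.File;

// Hypothetical domain class, mapped to <project name="..."><path>...</path></project>.
@XmlRootElement
public class Project {
    @XmlAttribute
    public String name;
    @XmlElement
    public String path;

    public static void main(String[] args) throws Exception {
        JAXBContext context = JAXBContext.newInstance(Project.class);

        // Marshal: object to XML document.
        Project project = new Project();
        project.name = "demo";
        project.path = "C:\\data\\demo";
        context.createMarshaller().marshal(project, new File("project.xml"));

        // Unmarshal: XML document back to an object, no manual DOM traversal.
        Project loaded = (Project) context.createUnmarshaller().unmarshal(new File("project.xml"));
        System.out.println(loaded.name + " -> " + loaded.path);
    }
}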

However, these libraries are quite problematic when they stop working after upgrading to a newer Java version. This is exactly what happened with JAXB after Java 8. Fortunately, solutions exist, but they don’t always work.

This post summarizes the known problems and proposes solutions. We first start with the classic dependency issues due to the removal of JAXB from Java SE, then go on with a less common issue that can arise in applications involving multiple class loaders.

A somewhat known but not fully solved problem

There are many, sometimes misleading, posts and forum questions about JAXB.

These posts lead me to believe not enough effort was spent addressing the issue. This could be explained by many developers moving off XML in favor of other formats like JSON or YAML. XML is a bit too verbose, and many parsers must be explicitly configured to disable external entities to avoid security vulnerabilities. JSON is a much simpler format, but it doesn’t support comments by default. YAML is a bit less verbose than JSON and also supports comments, but its indentation-based syntax can be misleading.

In our case, we were stuck with XML-based projects our component had to keep loading. Migrating to JSON or YAML would break backward compatibility, and providing migration tools would have been as tedious as parsing the XML documents without JAXB. Maybe a migration tool would be easier to write in Python, but such an offline tool would add complexity. If we had to switch project formats, the Java code would need to be able to load both the old XML-based projects and the new ones; an offline tool converting XML to something else wouldn’t do.

    Compilation errors due to JAXBContext not found

The first thing that happens after migrating a JAXB-enabled program from Java 8 to a newer version is a compilation error because JAXBContext cannot be found anymore. This is because JAXB was turned into a module in Java 9 and is not on the module path by default. Although this can be solved, it is better to explicitly add a JAXB library as a third-party dependency, since JAXB was removed completely in Java 11.

Besides the JAXB API itself, the program needs an implementation. The most common JAXB implementation is the Glassfish one.

    Here is an example of Maven dependencies for JAXB.

    <dependency>
        <groupId>jakarta.xml.bind</groupId>
        <artifactId>jakarta.xml.bind-api</artifactId>
        <version>2.3.3</version>
    </dependency>
    <dependency>
        <groupId>org.glassfish.jaxb</groupId>
        <artifactId>jaxb-runtime</artifactId>
        <version>2.3.6</version>
        <scope>runtime</scope>
    </dependency>

    There are more recent versions of JAXB, but they change the package names of the classes. The 2.x is the closest to what was provided in Java 8.

Setting the scope to runtime for jaxb-runtime is not strictly necessary, but it is a good idea to help IDEs propose meaningful code completion. No code should directly refer to JAXB implementation classes; code should only interact with JAXB through its public API. This allows the implementation to be switched if need be.

    javax.xml.bind.JAXBException: Implementation of JAXB-API has not been found on module path or classpath.

The main cause is the absence of a JAXB implementation on the class path. This can usually be solved by adding a dependency to your build file. See above for a Maven example.

    If the error persists, next step to investigate is obviously to look at the cause in the stack trace. There is a common but misleading cause:

    Caused by: java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory

Inspecting the Glassfish JAXB implementation JAR, one can find that the implementation class is com.sun.xml.bind.v2.ContextFactory, without the « internal ». It is then tempting to believe JAXBContext.newInstance() is using the wrong default factory class. However, this is not the problem, at least for JAXB API 2.3.3. I had to dig into the source code of JAXBContext to verify this.

JAXBContext.newInstance(), from the JAXB API, applies several strategies to search for a JAXB implementation, and it falls back on trying to load the Java 8 default class only if everything else fails. If a JAXB implementation is present on the class path, JAXBContext.newInstance() should find it and not fall back on the missing Java 8 default.

    Class loader intricacies

What if, despite the fact that you checked, double-checked, triple-checked, and asked other developers to check, double-check and triple-check that the JAXB dependencies are correct, your program keeps complaining that the JAXB implementation cannot be found? Checking again, as many forum posts suggest, won’t help in some cases. Trying different versions of the JAXB artifacts could help, but doing so blindly is likely to be of no help.

In a nutshell, you need to make sure JAXBContext.newInstance() is called at a place where the context class loader is correctly set up. Calling JAXBContext.newInstance() from a worker thread spawned by the Java ExecutorService or ForkJoinPool can cause issues. It is better to construct the JAXBContext instances you need at the beginning of your program’s execution, store them in static variables and reuse them instead of recreating them again and again. Your program will benefit from an almost free performance boost when loading from or saving to XML, and you will be less likely to get JAXB issues. The JAXB problems, if any, will pop up fast, right at application startup, rather than later on, after the application starts receiving requests.
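Here is a minimal sketch of that caching approach; XmlContexts is a name of my own, and MyXMLObject is the same hypothetical bound class used in the snippet further down:

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;

// Builds the JAXBContext once, when the class is initialized (ideally
// triggered from the main thread at startup), then reuses it everywhere.
// JAXBContext is thread-safe; the Marshaller and Unmarshaller it creates are not.
public final class XmlContexts {
    public static final JAXBContext PROJECT;

    static {
        try {
            PROJECT = JAXBContext.newInstance(MyXMLObject.class);
        } catch (JAXBException e) {
            // Fail fast at startup instead of deep inside a worker thread.
            throw new ExceptionInInitializerError(e);
        }
    }

    private XmlContexts() {
    }
}

Each operation then calls XmlContexts.PROJECT.createUnmarshaller() or createMarshaller() to get its own short-lived, non-shared instance.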

Understanding the reason for this is not obvious and requires digging into Java class loaders. When a Java application requires a new class or resource, it uses a class loader to search for it. Most class loaders look for resources at well-defined locations forming the class path. Locations can be directories, archive files (with the .jar extension) or URLs to archive files (the contents will be downloaded as needed).

Simple applications started with the java command line, running a main method from a class, have a single class loader. This loader is created at startup, using a class path coming from the CLASSPATH environment variable or passed on the command line using the -cp option. Java applications started with the -jar option also have a single class loader. If JAXB can be verified to be on the class path, the application should work correctly.

Problems arise when the Java Virtual Machine deals with multiple class loaders. Spring Boot applications can have more than one class loader if they create multiple application contexts. Even running a Java program through the Maven Exec plugin results in a second class loader being created for the execution.

When there are multiple class loaders, a question should come up: how does JAXBContext.newInstance() choose a class loader to search for a JAXB implementation? There are unfortunately multiple possibilities, and the answer is not necessarily the one you expect.

    1. JAXBContext.newInstance() has some overloads accepting an explicit class loader. In that case, that class loader will be used to search for the implementation. But being in control of the class loader doesn’t necessarily mean you will know the correct one to pass. Moreover, not all forms of newInstance accept a class loader, e.g., the one taking XML classes to bind to doesn’t.
    2. Some developers, including me, could think that JAXBContext.newInstance() will use the class loader bound to JAXBContext. This class loader can be retrieved easily, using JAXBContext.class.getClassLoader(). If JAXB-API and JAXB implementation are on the same class path, the class loader that loaded JAXBContext should be suitable to find the implementation. But JAXBContext.newInstance() gets its class loader another way. See 3.
3. Using the current thread’s context class loader. This can be retrieved using Thread.currentThread().getContextClassLoader(). It can be confirmed, by checking the source code, that this is how JAXBContext.newInstance() locates the JAXB implementation.

Wrapper code such as the Maven Exec plugin or Spring Boot, which creates a custom class loader, will take care of setting that class loader as the current thread’s context class loader, using Thread.currentThread().setContextClassLoader(). However, what if the wrapped code uses other threads?

If the wrapped code creates a Thread of its own, using new Thread(), the new thread inherits the context class loader from its parent. However, things are different when threads from a pool are reused. Facilities offered by Java, such as the ExecutorService and ForkJoinPool, can create threads, and it is not guaranteed that these threads, which can be reused by multiple parent threads, will have the expected context class loader. They may end up with the default class loader, and that class loader could be unable to locate the JAXB implementation, unless it is baked into the Java standard library, as in Java 8.
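A tiny experiment can make this visible. The following sketch (class name is mine) prints the context class loader seen by the main thread and by a pool thread; on a plain java launch both usually match, but under a wrapper that installs a custom class loader, the pool thread may report a different one:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ContextLoaderDemo {
    public static void main(String[] args) throws Exception {
        System.out.println("main thread: " + Thread.currentThread().getContextClassLoader());
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // The pool thread may not carry the loader the wrapper installed on main.
        pool.submit(() -> System.out.println("pool thread: "
                + Thread.currentThread().getContextClassLoader())).get();
        pool.shutdown();
    }
}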

    This explains why a program that worked well with Java 8 starts to exhibit JAXB errors in Java 9 and onward.

    Possible solutions

1. As outlined above, the ideal solution is to make sure JAXBContext.newInstance() is only called from the main thread and the JAXBContext instances are cached and reused. JAXBContext is thread-safe; only the Marshaller and Unmarshaller instances created by JAXBContext are not thread-safe.
2. If the above fails, one quirky possibility is to temporarily set the context class loader to something that can locate the JAXB implementation, then reset it. Here is an example.
ClassLoader oldContextClassLoader = Thread.currentThread().getContextClassLoader();
JAXBContext jc;
try {
   // Temporarily use the class loader that loaded the JAXB API; if the
   // implementation is on the same class path, it will be found.
   Thread.currentThread().setContextClassLoader(JAXBContext.class.getClassLoader());
   jc = JAXBContext.newInstance(MyXMLObject.class);
} catch (JAXBException e) {
   // Do something, this is a checked exception.
   // Lazy developers sometimes do this and that's not ideal.
   // If you do this, at least pass the thrown exception as the cause.
   throw new RuntimeException("An error occurred", e);
} finally {
   // Always restore the original context class loader.
   Thread.currentThread().setContextClassLoader(oldContextClassLoader);
}

Another, very bad, solution may be to directly call the ContextFactory.newInstance() method from the JAXB implementation. This is not great, because it ties your code to a specific implementation, but it is at least better than getting rid of JAXB.

  • Many ways to create the same thing

One important issue in software development is that problems can be solved in many ways and not all solutions are equal. Some solutions cannot be maintained because some developers cannot grasp how they work and insist on switching to something else. Some solutions cannot be deployed because the chosen providers don’t support them. Some solutions don’t adapt to the requirements of the customers, or to new requirements coming after the fact.

The following summarizes, per stakeholder, the problems that happen again and again in software development.

Developers:
• Limited knowledge of available solutions
• Insist on using the solutions they know about and have others adapt and rewrite
• Narrow understanding of requirements

Customers:
• Implicit requirements taken for granted
• Limited tolerance to errors in interpreting requirements
• Limited tolerance to delays

Hosting providers:
• Limited set of platforms
• Too high cost for some customers
• Vendor lock-in

On one hand, developers know about some solutions and not others. They do their best to understand requirements and design something that works, but they cannot know everything. Usually, they are constrained by time, so they prefer to go with what they know rather than spend countless hours learning a new technology for a single project. This is especially problematic when new developers, with different backgrounds, join a team. Each developer knows about different solutions and cannot afford to adapt to something new, so each will insist on moving everything to what he knows! This can slow down a project and even stop it altogether.

On the other hand, customers present requirements in a shallow way, taking many things for granted. When something taken for granted is not implemented, this usually results in negative reactions, making developers upset, guilty and urged to fill the gap. Any additional delay is seen as a catastrophe that threatens the project, the business relationship with the customer and sometimes even the position of the affected developers. This results in more and more hacking and patching, reducing the quality of the solution and causing further requirement mismatches, bugs and disappointment, for both customers and developers.

Then the provider deploying the solution imposes constraints as well. Sometimes, the platform is limited to only some technologies, which the developers may even be unaware of. Other times, the deployment cost is prohibitive for the customer. Some providers, especially cloud services, have specific terminology and system architectures making it hard to switch from one provider to another without major refactoring, unless these problems are known in advance.

This article illustrates these problems using a very simple use case: a web application showing a clock to the user. There are many ways of implementing a clock, and each solution has benefits and drawbacks. Choosing the wrong clock implementation is likely to require a rewrite of the whole system. While reading about this use case, think about how the complexity of real-world projects such as Gmail, Facebook, Netflix, Amazon, etc., is orders of magnitude greater than that of a clock.

    Initial requirement

    Suppose we want a webpage showing an analog clock to the user. The clock needs to reflect the current time.

    Static web page

The simplest way of implementing the clock is a plain Web page, an HTML file, that loads an image showing the clock. This works in any browser, even the oldest. Some naïve developer may think the problem is solved. But how about showing the current time?

For this, one possible solution is to create one image of the clock for each position and enhance the HTML page with a piece of Javascript code that loads the correct image based on the current time. There will be many images, one for every minute of every hour, namely 720 images.

Some developer will go over the top, creating all these 720 clock images in a graphics program, managing to come up with a naming scheme, and implementing the piece of Javascript code that picks the right image based on the current time. Then guess what? The customer will look at this and complain: he wanted three needles, not just two: one for the hours, one for the minutes, one for the seconds. So we end up with 43200 clock images to create! Even worse, the customer dislikes the layout, so the initial 720 images need to be revised anyway.

Then, after some thought and discussion, developers will agree on the need for a program that generates all these images. This could be done ahead of time, but why not have the HTML page call that program itself? This is perfectly doable, by replacing the static image with a web service that returns a dynamic image.

That looks simple, but there are many different ways of implementing that backend service: a CGI script (the old way), PHP (another old way), then others will come up with a proposition to implement everything from scratch using Django, then cloud developers will propose serverless technology such as AWS Lambda. And after the programming language is determined, there are several libraries to create images.

While developers argue about which backend service to use, the customer discovers with great surprise that the clock image appears very tiny on his 4K monitor. Who thought somebody would view that page on a high-resolution display? Bitmap formats such as JPEG, GIF or PNG, with a fixed resolution, won’t do the trick. The system needs to serve a vector representation of the clock, most likely using SVG. Ah no, this doesn’t work with certain old browsers! Are we sure the customer doesn’t use such an old browser? We can ask the customer; he will probably say no, without even being sure he understands the question, then figure out he has an old PC stuck with Windows XP and an old Internet Explorer not supporting SVG. But that will surface after the fact, once the SVG backend is in place!

We can discuss the backend service forever, only to figure out that the customer wants to deploy on something only supporting the old PHP or, even worse, that we forgot something: the clock needs to tick with the current time, not just render the time at which the page was loaded! While this seems intuitive, developers, overwhelmed by all the technical details, may lose track of this obvious fact. A clock ticks, needles move!

Some hacker will come in, proposing to just refresh the whole page every second. This will result in something ugly that flickers. The customer may agree to this, but will of course reject the solution when he sees the flickering result!

At this point, it becomes obvious we are at a dead end. A static page won’t do it; we need something more dynamic.

    Dynamic web page

There are many ways to render a clock dynamically on a web page nowadays, using Javascript code. One developer can write code generating SVG to draw the clock and needles, another will prefer the HTML5 canvas (and will have to argue with the one who prefers SVG because of his background or former job), another will propose a library such as Raphael instead of built-in SVG or HTML5, and another may even propose Flash, even though it is deprecated, because it is supposedly easier and more intuitive, and works with more browsers.

Then how about having the clock move? Some people will advocate directly manipulating the document model using Javascript, but that won’t work portably across browsers. Most developers thinking about this solution will immediately shift their mind to jQuery, a widespread library for manipulating a web page dynamically in a way that is portable across browsers. Then a developer with a background in data processing will propose the D3 library to make the needles move. jQuery and D3 work in very different ways; they implement different paradigms. Switching from one to the other is not so obvious, but the D3 developer expects the seasoned jQuery programmer to use D3, and the jQuery developer would like to get the D3 guy on jQuery.

Then seasoned frontend developers, or people who want to become one, will go for the heavy lifting, proposing frameworks such as AngularJS, React, VueJS, etc. Yeah, just to render a clock! Such frameworks will do the job, at the expense of a lot of complexity, but they can do much, much more. A React-based page showing a clock could allow customizing its style on the fly, with the modified style applying immediately, without flickering. If one decides to add multiple clocks at different time zones on the same page, that will be possible without breaking everything.

However, no developer knows all frameworks. A newcomer will likely know something different and start advocating switching to what he knows. « I don’t know much about React, but that seems easier to do with VueJS. » Another will prefer Ember, yet another AngularJS, etc. All of these frameworks will bring development to a stop if each developer wants to use his own framework.

    « Modern » web application

Maybe I should not write « modern », because by the time this article is read, this may not be modern anymore. In the previous section about the dynamic web page, what happened to the backend? Well, it is not needed anymore. Everything can happen on the client side. With Javascript and HTML5, web pages can now do much more work than before.

One reason a backend may be necessary is persistence. Suppose we want to allow customizing the clock and saving the settings to a user profile that would be reused on different devices. This simple requirement makes the project a lot more complex, requiring persistent storage (thus a database) and some kind of identification system to know which profile to load. For clock settings, authentication is probably overkill, but it is a necessity for most real-world applications, and that is a huge can of worms.

Things become complex quickly, making it hard for developers to do it right. There are (too) many different solutions. One may prefer to write the backend service using Java, and Java offers tons of frameworks for this, such as Spring MVC, Jersey, etc. Another will use Python with Django, Tornado, Flask or something else. Yet another will lean towards NodeJS with Express or some other library. Another may prefer to keep using PHP, or even write something in C++! There are also many possibilities for the database system, such as MySQL, PostgreSQL, MongoDB, Redis, etc.

That doesn’t even take into account bold developers who don’t trust anything and absolutely want to implement everything from scratch, using the most basic stuff they can find. Let’s make this backend in C, or even in assembler, and have control over everything!

    No matter what is chosen, it will work somehow and will fail somehow. Developers will be stuck with the issues and if they cannot find a solution, others will advise using some other technology.

While dealing with all these options and technical details, many will forget something essential: writing automated tests! While that seems unnecessary at first, it becomes fundamental as the complexity of the project increases. Automated tests help moving forward with reasonable confidence that what previously worked still works.

    The shocking tendency to give up

By the time developers try to implement this stupid clock, discussing and arguing about the best frameworks, the customer gets nothing more than explanations, nothing concrete. At some point, he will just go to a store and buy a watch. Here is my clock. It works, it does what it has to do, it ticks with the current time. While the idea was great, it took too long and was too complex to implement, so giving up and going elsewhere is the way to go.

Another example of this could be a customer requesting a web application to collect data and, after some delays or failures, deciding to fall back on a simple Excel (or, for better or worse, LibreOffice Calc) spreadsheet.

    Conclusion

While having multiple solutions available is fine, too many can be worse than not enough. When there are too many options to choose from, not enough guidance, and errors require costly refactoring or even a rewrite from scratch, progressing is hard. We end up redoing the same thing over and over again. The fact that many people implement the same thing over and over in silos, independently of each other, doesn’t help, but gathering everyone into one super large team would cause endless argumentation, bringing all development to a full stop.

Maybe developers should spend more time using what is there rather than making it work the way they think it should, or reinventing the wheel. Maybe customers should spend more time reflecting on their requirements and discuss more with developers to figure out the consequences of these requirements for the project. Maybe providers, especially web hosts, should stop sticking to obsolete technologies such as PHP and offer facilities to host modern web applications in addition to the old stuff that may still be needed for existing projects; or cloud providers should offer options allowing deployment of web applications at the same cost as simpler web hosting platforms (no virtual machine to run a simple NodeJS server, for example). Or maybe I am wrong, losing my mind because of all that is going on in the world.

• The downside of a NAS

Recently, I almost lost a bunch of ripped blu-rays, DVDs and downloaded movies and TV series. I thought a RAID5 would preserve them reasonably well, but I didn’t consider carefully enough the recovery scenarios in case the whole NAS dies. I learned how a NAS can be expensive and how I painted myself into a corner, but I also learned more about logical volumes, which could greatly improve my Linux installation in the future.

A NAS: a great idea at first

I used to store movies and TV series on separate drives. After I almost filled up two 3 Tb drives with ripped blu-rays, I wanted to do better than just adding another drive. Otherwise, I would end up checking all three drives to find a given movie or TV series. I thus needed a solution to combine the storage space into a single pool.

One way of doing that is a Redundant Array of Inexpensive Disks (RAID). This can use several drives to improve reliability (mirroring), performance (striping) or both. This also combines the space of multiple drives, resulting in a virtual device with more space. I wanted to go with a RAID5, because that results in more space, with the possibility of one drive dying without losing access to the data. Of course, if a drive of a RAID5 dies, you need to replace it as soon as possible so the array can rebuild and be reliable again; otherwise, if another drive fails, data loss occurs. The larger the storage pool, the more catastrophic the data loss!

Unfortunately, at the time I investigated this, there was no user interface in Ubuntu that I knew of to set up a RAID (that may have changed). One needed to copy and paste obscure commands from web pages. I got tired of such processes and wanted some easier way, other than trying Windows, since I don’t have a Windows license for my HTPC holding the storage.

I thus wanted to experiment with a Network-Attached Storage (NAS) device. I picked the TS-453Be from QNAP, because I wanted a 4-bay device that was not too expensive. I found a lot of 2-bay devices, which would have prevented me from building a RAID5, and forum posts suggesting that RAID5 is not great because all data is lost if more than one drive dies, that a mirrored RAID is better, etc. Because I didn’t have four drives yet, I started with two 3 Tb drives, with the plan of expanding to four.

One good side of these devices is their ability to be configured remotely. Instead of hooking the device to an HDMI port (you can do so if you want to) and interacting with it using a screen, you can point a web browser at its interface and configure everything from there.

The downside is the quantity of settings that made little sense to me, some of which remain obscure, like JBOD and iSCSI. It was relatively easy to get a RAID set up, but I quickly noticed that only a portion of the space available on the drives was usable by the volume containing the data. This is because the physical space can be used in several ways by the NAS, not just for storing data. One can create snapshots or split the space into multiple logical volumes. Volumes can also host virtual machines. I had a hard time finding how to expand the logical volume holding my contents, because the option was hidden away in the UI and called « Étendre » (expand) in French, which I confused with « Éteindre », meaning power off. Everything was there to expand the logical volume, but nothing to shrink it afterwards.

A couple of months later, I found out it was possible to convert the mirrored RAID into a RAID5 after I added two new drives. Since a RAID5 provides the capacity of all drives minus one, that gave me (4 − 1) × 3 Tb = 9 Tb of storage space, enough for the moment to store my stuff.

Getting more would however be painful, time-consuming and expensive, requiring me to swap out each 3 Tb drive for a larger one, one drive at a time. Each time a drive swap occurred, the RAID would have to rebuild from the three remaining drives, and only after the rebuild completed would I be able to swap another drive. After all drives were swapped, the storage pool would expand based on the size of the smallest drive.

Then what’s the point? Why not copy all the data elsewhere, rebuild a new array and copy the data back? The longer process has the benefit of leaving the storage pool online throughout the whole migration. Data can still be accessed and even, as far as I know, modified! A dying drive also doesn’t bring the pool offline: the degraded RAID can still work until a new drive is added, and the rebuild occurs online, while data can still be accessed. I realized that this high availability was overkill for me, but for a business, it would be critical.

A hybrid backup strategy

I didn’t have enough storage space to back up the whole RAID5 array. Instead of solving that, I went with a hybrid strategy.

• My personal data is already backed up on Dropbox and present on two machines at home. My idea was to make my NAS the second Dropbox client, but I quickly noticed that Dropbox doesn’t run on QTS, the Linux variant QNAP installs on the NAS. I thus needed to keep my HTPC for that. Dropbox gives me an offsite backup, which would be handy if a disaster such as a fire destroyed my home. But thinking about it more, after such a disaster, getting back my photos, videos, Live sets, etc., would be a rather minor concern.
• I have hundreds of gigabytes of Minecraft videos I authored and uploaded to YouTube, and I wanted to keep an archive of these. I ended up uploading them to an Amazon S3 bucket, now moved to Glacier. This stores the data for a low price but requires more time and some fees to get the data back. I thought the NAS could itself synchronize the MinecraftVideos folder with S3. No, it cannot sync with S3, not anymore! QNAP switched gears, giving up on S3 in favor of inter-NAS synchronization! That means if I want an automated backup, I would need to buy a second NAS with the same (or higher) storage space and set it up to synchronize with the first. For a small business, I can imagine this being possible, but for a home user, that looks like overkill.
    • I thought that not backing up the ripped DVDs and blu-rays would not be a big problem since I have the originals. This was a big mistake.
• More and more recorded videos added up, with no backup. Using my Hauppauge HD PVR, I recorded several movies from my Videotron set-top box.

    My backup plan was thus outdated and needed some revisions and improvements.

    Noisy box

The NAS, in addition to its lack of integration with Dropbox and S3, was quite noisy. Sometimes it was quiet, but regularly it started to make a humming noise that wouldn’t stop unless I powered it off or tapped on it a few times. I searched a long time for a fix. I tried screwing the drives in instead of just attaching them with the brackets, but the real problem was that one of the drives (a new one) I had put in the NAS was bad and noisy! Swapping it for another 3 Tb drive I had fixed the issue.

    I moved the noisy drive into my main computer, where it was relatively quiet at first. But it ended up making an annoying humming sound that my mic was picking up, reducing the quality of many of my Minecraft videos. At some point, I got fed up and decommissioned that drive in favor of the last 3 TB drive remaining in my HTPC. My NAS combined with an NVIDIA Shield was pretty much replacing my HTPC, which I was more and more thinking about decommissioning.

    The disastrous scenario

    During the summer of 2020, my QNAP NAS suddenly turned off and never powered on again. When I turned the unit on, I was getting a blinking status light and nothing else. At first, I was pissed off, because I wasn’t sure I would be able to fix this and was anticipating delays in getting it repaired because of the COVID-19 pandemic. My best hope was to recover the data using a Linux box; then I would decide whether to get the NAS repaired or replace it with a standard Linux PC. Recovery ended up being harder than I thought, which made me angry.

    When I lost access to all my files, I realized how time consuming re-ripping the blu-rays and DVDs would be. I would need to insert each disc into the blu-ray drive, start MakeMKV, click the button to analyze the disc, wait, wait, wait, then enter the path to store the ripped files to. Even though there is a single text field to fill in, MakeMKV doesn’t put it in focus, forcing me to locate the super small mouse pointer (at least on my HTPC, where the blu-ray drive to rip from was), click, enter the path, check it (in a super small font), click again and then wait, wait, wait. For one disc, that’s OK. For 30…
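    MakeMKV does ship a command-line tool, makemkvcon, which could take the GUI clicking out of the loop; I haven’t tried it on my setup, and the disc index and output path below are assumptions:

        # List available discs and their indices (robot-friendly output)
        makemkvcon -r info disc:9999
        # Rip every title from the first disc into a folder, no GUI involved
        makemkvcon mkv disc:0 all /mnt/storage/rips/SomeMovie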

    I also lost a bunch of movies recorded using my HD PVR. The quantity of recordings increased over time and I didn’t realize none of this was backed up! Backup plans need to evolve and be revised over time.

    Recovery attempts

    The first problem was to move the four drives into a computer able to host them. I didn’t have enough free bays in my main computer, not without removing another drive, and I was worried Windows would get screwed up by this and wouldn’t re-establish broken links to my Documents, Music, Videos and Pictures folders. I thus chose to use my old HTPC as a host, but fitting and powering the four drives was a painful process. I had to use pretty much all the SATA cables I had, and one broke during the process. I had to unplug the hard drive of a really old PC to get its SATA cable, unplug the DVD drive of my main PC to get its SATA cable, dig one other SATA cable out of a drawer, etc. I also needed a Molex to SATA converter cable because the PSU only had four SATA power connectors and I needed to power five drives: the four NAS drives to rebuild the RAID array, plus the SSD containing Ubuntu! Because of the pandemic, I wasn’t sure whether the computer stores in my area were open, so my best bet to get new hardware was online, with delays of several days, even for something as simple and stupid as a SATA cable. Using what I had was my best option.

    All these efforts were pretty much worthless because I wasn’t able to access the data. We’ll see why later on. All I could do was trigger a SMART long self-test, to at least verify the drives were good. All four drives passed the test. There would have been no point in getting the NAS fixed or continuing recovery attempts if more than one drive had failed that test.
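    A sketch of how such self-tests can be run with smartctl from smartmontools; the device names are assumptions about how the four drives enumerated on the recovery box:

        # Kick off the long self-test on each former NAS drive
        for d in /dev/sd{a,b,c,d}; do smartctl -t long "$d"; done
        # Hours later, check each log; look for "Completed without error"
        smartctl -l selftest /dev/sda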

    I couldn’t go further without ordering hardware. I started with a four-bay USB to SATA docking station. Finding one was tricky at first (one bay only, Mac only, etc.), but I got one and it worked like a charm. It did cause me issues at the beginning: I plugged the power cable in the wrong direction and tested it with a defective drive (yes, the noisy 3 TB drive I removed from my system just doesn’t power up anymore!!!), but it ended up working.

    I was hoping to put the four NAS drives in there and have Windows see the drives, then I would try using ReclaiMe to get the files back. I also needed a drive large enough to hold all the data. I got a 10 TB one; that would do, I thought.

    Since I was able to reassemble the block device using mdadm, I explored the idea of dumping that block device to a partition on the new 10 TB drive. For this, I had to plug the 10 TB drive into my HTPC, which was already out of connectors with 5 drives in it! I used the USB docking station for that. Bad idea: the HTPC is too old, having just USB2, and copying terabytes through USB2 is a good way to strengthen your patience and get annoyed. It would have taken more than 2 days just for that step to finish! There was no solution, unless I got a PCI Express card with a USB3 port or, probably better, an eSATA port, and then I would need an eSATA drive enclosure to put the 10 TB drive in! I didn’t like this at all, because that was a lot of waiting for only an intermediate result.
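    The dump itself would have been a one-liner; a sketch, assuming the assembled array shows up as /dev/md0 and the target partition on the 10 TB drive is /dev/sde1:

        # Raw block-level copy of the assembled RAID onto a partition
        # at least as large; over USB2 this crawls along for days
        dd if=/dev/md0 of=/dev/sde1 bs=64M status=progress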

    After the failure to dump the block device because of USB2 slowness, I got super fed up and decided to proceed with the repair of my NAS. I contacted QNAP’s technical support and we figured out that a repair was needed; my warranty had expired a year earlier. I thus needed to pay $300, plus shipping of the NAS. I felt at that time it was my last hope of recovering the data.

    But I tried anyway to get the drives out of my HTPC and plug them into my docking station, so ReclaiMe could analyze them. Well, ReclaiMe completely failed. It detected the logical volumes on the drives, which looked promising, but instead of making use of the Ext4 file system, it just scanned the drives for data looking like files. It was thus only able to extract files with no names but some contents, without even being sure the files were complete! That would be unusable garbage; better off just re-ripping all the discs. R-Studio, another tool I tossed at this, also failed miserably, not even able to reassemble the RAID5. I got so fed up that at some point I considered contacting a data recovery company, but I was concerned it would be too expensive and I would just get chunks of data named with hash codes, like I was about to get with ReclaiMe.

    Lasagna file system

    QTS makes use of multiple layers when configuring a file system. Each layer provides some features, but all of this adds complexity at recovery time.

    The diagram below summarizes the structure.

    Diagram: four 3 TB hard drives → 9 TB RAID5 storage pool (mdadm) → DRBD(?) → volume group (LVM pvcreate/vgcreate) → logical partition (LVM lvcreate) → Ext4 filesystem holding Movies, TV series, Music.

    First, the physical drives are combined into a RAID5 using the MD driver of the Linux kernel. This allows creating a RAID array in software, without any specialized hardware. The mdadm tool can be used to configure or activate a RAID. I was able to activate the RAID and get a block device out of it.
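    Activation boils down to letting mdadm read the RAID metadata off the members; a sketch, where the member partition numbers are assumptions, not necessarily what QNAP uses:

        # Scan all drives and reassemble any array found in their metadata
        mdadm --assemble --scan
        # Or name the members explicitly
        mdadm --assemble /dev/md0 /dev/sda3 /dev/sdb3 /dev/sdc3 /dev/sdd3
        # The resulting block device and its state show up here
        cat /proc/mdstat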

    The block device could host a filesystem directly. Instead, it is formatted by QNAP as a DRBD storage pool, at least according to the forums I searched. Some people attempted to mount the DRBD device without success, because QTS uses a forked version of DRBD preventing anything other than QTS from reading it! Because of that, only a QNAP NAS can reassemble the RAID5 and get the data back!

    The DRBD volume is formatted as a physical volume for the LVM system. The Logical Volume Manager (LVM) allows splitting a storage device into multiple logical partitions. Partitions can be resized at runtime and don’t have to use contiguous space on the physical volumes; they can span multiple physical volumes as well. This is something any ordinary Linux distribution supports, as it is part of the mainline Linux kernel! The only caveat is the absence (at the time I am writing) of user interfaces exposing these features. One needs to use commands such as pvcreate, vgcreate and lvcreate to manipulate the logical volumes, but the commands are not as complex as I thought.
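    For a feel of those commands, here is a minimal sketch of building such a stack by hand; the volume group name vg0, the logical volume name movies and the size are all made up:

        # Tag the RAID block device as an LVM physical volume
        pvcreate /dev/md0
        # Group it (alone, here) into a volume group named vg0
        vgcreate vg0 /dev/md0
        # Carve a 4 TB logical volume out of the group and format it
        lvcreate -L 4T -n movies vg0
        mkfs.ext4 /dev/vg0/movies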

    I read that QNAP also forked LVM, so I was worried that even if I got past the DRBD layer, I would not cross the LVM one.

    Note that when I partitioned my 10 TB drive, I found an LVM partition on it! The assembled RAID apparently was an LVM physical volume, so maybe I would have been able to vgimport it and get access to the logical partitions! However, any attempt to do so could have failed or changed the machine id in the volume group, reducing the chances that my fixed NAS would mount the array and filesystems. My new 10 TB drive was already formatted at the time, with some data on it, so I couldn’t use it to back up the full RAID before testing, unless I got yet another drive. I thus decided to stop my attempts there, since my NAS had been shipped to QNAP at the time of that discovery.
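    Had I been able to test, the exploration would presumably have gone through the standard LVM discovery commands; a sketch, with hypothetical volume group and volume names:

        # Look for volume groups on all attached block devices
        vgscan
        # Activate whatever was found, then mount a logical volume read-only
        vgchange -ay
        mount -o ro /dev/vg0/data /mnt/recovery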

    On top of the LVM layer sit the logical partitions, including at least one large one with the files I wanted to recover. The Ext4 native Linux file system is used there to organize the space into files and folders. Recovering all the data requires handling the Ext4 filesystem to get the full file contents back, not just portions of files with no names.

    Full recovery

    I got the fixed NAS back and inserted the drives. The NAS powered up and recognized the drives as if it had never been broken. I was thus able to get my files back; everything was there. Because the recovery process was so painful and expensive, I didn’t feel any victory, just a bit of relief that this was over.

    While waiting for the fixed NAS, I formatted my 10 TB drive. I experimented with logical volumes, creating several partitions on the drive: one for Minecraft videos, one for movies, one for TV series, one for a full copy of my Dropbox folder, one for my music files, etc. Using the Ubuntu wiki, it was simple to create the logical volumes, and then I started copying files onto them. I was ready to transfer the contents of my NAS to the new disk when I got the NAS back.

    Even though I recovered my data, I will probably have to re-rip some DVDs that are unreadable by Kodi. The VIDEO_TS structure causes a lot of headaches for pretty much all Linux-based players. VLC seems the most versatile, able to read most DVDs, but sometimes I needed to use MPlayer, Kaffeine, etc. I remember this almost destroyed my dream of having all my DVDs and blu-rays on hard drives. Of course, Windows with PowerDVD or a similar DVD player would work better, but I don’t want Windows on my HTPC; better to return to the standalone DVD/blu-ray player and spend countless minutes searching for discs. MakeMKV should help solve that, because Kodi reads MKVs without issues. I may be able to convert previously ripped VIDEO_TS folders into MKV, saving me the trouble of re-ripping the discs.

    After that bad experience, I came up with the plan of keeping the NAS as long as it would work, but back up all data on another drive. If the NAS dies a second time, then I would not need to recover any data and would just repurpose the drives, probably in a standard Linux PC.

    Lesson learned: a NAS is for cases where you have more drives (more than four, if not more than eight) than a usual PC can accommodate. It is relatively straightforward to get a standard ATX case that will host six drives, including SSDs, and getting a power supply unit with six SATA connectors is perfectly fine. If I had to do it, I would probably explore the route of modular power supplies to reduce cable clutter, but even that is optional.

    By trying to save myself some copy/pasting, I ended up with more pain and problems. Had I spent at least half a day exploring logical volumes, at worst experimenting in a virtual machine, I would have figured out that my existing HTPC was able to combine my existing drives into a storage pool that can expand over time. If the HTPC dies, another Linux PC can import the volume group and things go on. Unfortunately, nothing can prevent disaster caused by failed drives other than backups.

    Other benefits of logical volumes

    After exploring logical volumes, I am pretty sure I need them for my next Linux installation, because they will solve a bunch of fundamental issues I keep running into again and again. A sketch of the corresponding commands follows the list.

    • Each time I perform an upgrade, I run the risk of a catastrophic issue making the whole system unusable. Ubuntu offers no downgrade path: if an upgrade fails, just reinstall from scratch. This is why several people suggest never dist-upgrading, but reinstalling clean every time. Logical volumes alleviate that through snapshots. Before a dist-upgrade, I could just create a snapshot of the volume holding Linux, upgrade, and in case of an issue, restore the snapshot in a few minutes; no reinstall, no reconfiguration.
    • Supporting upgrades, downgrades, multiple versions and multiple Linux distributions requires the home directory to be separate from the root file system. But each time I over-partition my drive, one partition gets full and I have to either move data around or restart the computer and repartition, which is time consuming and a risk for data loss (e.g., a power outage while GParted moves data!). Logical volumes solve that by allowing resizing at runtime. If I need more than 50 GB for my home volume, no problem: just claim some extents from the physical volumes, with no need for contiguous space, and the resize occurs at runtime, without any unmount or reboot. I can keep working while the resize occurs. That’s really neat and powerful.
    • Even the classical problem of migrating to a new drive is easier with logical volumes. LVM can move all data from one physical volume to another, transparently, at runtime, while I work on the machine. Replacing a too small or end-of-life SSD is thus easier. Of course, sharing an SSD between Windows and Linux remains a painful and problematic process, although perfectly possible; dual booting Windows and Linux is itself painful and problematic anyhow.
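    A minimal sketch of those three operations, assuming a volume group named vg0 with root and home logical volumes (all names, sizes and device paths are made up):

        # Snapshot the root volume before a dist-upgrade
        lvcreate -s -L 10G -n root_pre_upgrade /dev/vg0/root
        # If the upgrade goes wrong, merge the snapshot back into root
        lvconvert --merge /dev/vg0/root_pre_upgrade
        # Grow the home volume by 20 GB, filesystem included, online
        lvextend -r -L +20G /dev/vg0/home
        # Evacuate a dying SSD: move its extents to a replacement
        # physical volume already added to the same group, live
        pvmove /dev/sdb1 /dev/sdc1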
  • An intricate sounding puzzle

    This whole story started on Wednesday, October 23rd 2019. When I turned on my subwoofer, I found out it was producing a constant noise, even with no audio source playing. I verified the connector at both ends; everything was fine. My first reaction was to think the subwoofer was broken. It was too big for me to carry to a repair shop, and I wasn’t sure it was worth doing, so I was planning to buy a new one. In the meantime, my sound system would still work, although without the subwoofer.

    Following is a picture of the back of the subwoofer, with the connector plugged in.

    The back of the subwoofer with connector plugged in
    The humming sound my subwoofer was constantly making

    The best way to get a new subwoofer, I thought, would be to go to the place where I purchased my home theater system, but it was in Chambly, near my parents’ place, and it would take weeks before I went back there. I thought about waiting but was tempted to check whether I could order a new subwoofer online, on Amazon for example, or find a store in Montreal. However, many stores only sell full kits with everything included (A/V receiver, speakers, subwoofer), or all-in-one sound bars.

    Finding the culprit: a ball game

    After thinking more about it, before buying a new subwoofer, I wanted to make sure mine was really faulty. A year earlier, I had issues with my A/V receiver, which started to repeatedly power cycle with no possible solution. I replaced it with a new one but kept the old one in case I could find somebody able to fix it. Although flaky, the old receiver could be used to test my subwoofer. If it produced the noise with the old receiver too, I would know for sure it was broken.

    To my surprise, the subwoofer worked flawlessly with the old receiver. Of course, it was just a basic test; I didn’t plug in any audio source to verify it would produce sound, but the absence of the constant noise was already progress. Unfortunately, chances were that the problem came from my A/V receiver again, which would force me to disconnect everything from it, carry it to a repair shop, wait days, and then spend hours reconnecting everything. The speaker cables were quite hard to plug into this one, too close to each other. I didn’t want to redo all this and was thinking about giving up and purchasing a dumb sound bar instead. But I was quite reluctant to give up on my speakers.

    At least, I was happy I didn’t order a new subwoofer or, even worse, go to the trouble of carrying one on the subway, from a local store back to my place!

    Before going to the trouble of disconnecting the speakers, I tried unplugging devices, to no avail. Even unplugging the A/V receiver didn’t stop the subwoofer from making its constant humming sound! Of course, turning off or unplugging the subwoofer stopped the sound!

    After more than an hour of plugging and unplugging, trying to replace the subwoofer cable to no avail, with stuff falling behind my TV, especially the AM/FM antenna of my A/V receiver and a small router configured as a switch to dispatch Ethernet to my HTPC and NAS, I found out that unplugging one HDMI cable from the A/V receiver stopped the humming sound! What? Which cable is that? A bit of crawling later, I found out: the cable linking my A/V receiver to my cable TV terminal. Replacing that cable didn’t change anything. I tried disconnecting the cable TV terminal, to no avail. I don’t know why I thought about it, but I found out that disconnecting the coaxial input cable from the terminal stopped the humming. Of course, that would prevent the cable TV from working!

    Next step: replace the coaxial cable. Unfortunately, I didn’t have a spare of a suitable length. The only one that worked was a 40 foot (too long) one, but the wires coming out of its connector were too long, preventing me from tightening the connector onto my terminal’s female port. Even worse, when the 40 foot cable touched the terminal’s connector the first time, the subwoofer briefly hummed. I was pretty sure purchasing and plugging in a new cable of the right length would produce the same effect as the original cable.

    I was stuck, because the problem was coming from the combination of my subwoofer, A/V receiver and cable TV terminal. I could, by trial and error, replace the subwoofer, then the A/V receiver, to no avail! Contacting the manufacturer of any of them would just get me bounced to the others! The only simple « solution » was to sacrifice one of the devices. Short term, it would be the subwoofer. Longer term, I was mentally preparing myself to give up on cable TV. I prefer a sound system over cable TV, because I can get content from elsewhere, like Netflix, if of course I can have it play under Linux. I won’t purchase a Windows license for a 10-year old HTPC! Doing so would be stupid nonsense, for me at least.

    What changed?

    But that setup was working before! What, all of a sudden, changed and caused it to fail? Well, it was a workaround for a cable signal issue!

    I had been getting issues with my cable TV terminal for a couple of days, with sporadic complaints about low signal, and decided it was enough. Wednesday morning, October 23rd 2019, I examined the connectors at both ends of the coaxial cable; they seemed tight, so that should work.

    The coaxial cable from the wall outlet connects into my power bar, which offers surge protection for coaxial input. Another cable goes out of the power bar and connects into my cable TV terminal. That worked like this for years, but it seems that every time somebody in my building gets hooked up or adds a device, the cable signal weakens for everybody else. I thus thought that maybe I could work around the issue by bypassing the power bar, hooking the cable directly to the terminal. I tested it and that seemed to work, but I just turned on the TV, not the subwoofer.

    Then I went to work, and only later did I turn on the whole system to watch a video on YouTube. That’s when I got the issue with the subwoofer, and I didn’t think at all it could be linked to the cable terminal.

    I ended up, on Wednesday night, October 30th, reconnecting the cable to my power bar. This looks as follows.

    Coaxial cable going in and out of the power bar

    But wait a second, wouldn’t that cause the signal issue to come back? Yes, it will. So far I haven’t observed it again, but I know it can come back. This just bought me some time, to think about something better, or to prepare myself for sacrificing the cable TV. For that, I need to test Netflix on my Linux HTPC, which will be a frustrating experience of trial and error.

    Why not contact my cable TV provider?

    I thought about it and may end up doing it. The problem is that there is little they can do to improve the signal without accessing a technical room in the basement of my building, and getting the key to that room is problematic these days, because the company managing the building is overwhelmed and not responding to every inquiry.

    Moreover, my cable company, Videotron, is moving towards Helix, a new technology combining everything into one device. Helix not only replaces the cable TV box but also the Internet modem, and would constrain me to use a specific router I know little about. If I’m lucky, the router will just work as a regular device, offering Ethernet ports and 802.11n or, hopefully, 802.11ac wi-fi, with standard WPA2 and a customizable network name and password. If I’m less lucky, the network name will be hard coded and I will have to search forever among similar but slightly different names, and I may be limited to 802.11n 2.4GHz wi-fi (no ac, no 5GHz). If I’m really, really unlucky, establishing a wi-fi connection will require proprietary software which may work on only some devices! The problem is that if I switch to Helix to get rid of my problems with my current Illico terminal, I won’t be able to know in advance whether all these quirks exist, and I will be stuck afterwards; it will be hard to go back. Competing telecommunication providers either don’t offer cable TV at all or adopt the same all-in-one modem/router strategy.

  • Not all USB C ports are equal

    I kind of already knew this but hadn’t fully realized it, until I got my new Flex 15 from Lenovo. This laptop comes with one USB C port. Lenovo proposes a travel hub as an accessory. That hub fits into the USB C port, offering one USB 3.1 port, one HDMI output and one Ethernet port. Unfortunately, that hub won’t work, not because it is defective, but because it is not compatible with the Flex 15! This article tries to explain why and outlines some possible uses of the limited USB C port provided by that laptop.

    USB C is the connector introduced alongside USB 3.1, completely different from its predecessors. USB 3 type A connectors fit in USB 2 and even USB 1 ports, but the USB C connector does not fit USB 3 ports at all.

    The USB C connector looks as follows.

    Example of a USB C connector

    The port is also different, smaller than USB 3’s, and the connector can be plugged in either way up. The female connector looks like this.

    USB C causes quite a few problems, because it is too new and because not all ports are equal, with no clear way of determining a given port’s capabilities. The main issues are, for the time being, compatibility with existing devices and the different variants of host controllers.

    New connector = new devices?

    The first problem is compatibility. As the connector is different, a USB C port accepts only USB C connectors, but most USB devices at the time I’m writing this are USB 3, not USB C. This won’t be a problem on most systems; at worst, the USB C port will just be a cool artifact that never gets used, a bit like the FireWire connector on some older laptops. However, there are some ultrabooks with only USB C ports, like Dell’s newer XPS models. I was quite shocked to see this and thought a user of such a machine would need to buy all new devices. That’s fortunately not the case, as I found out later.

    Then, what can be done with that USB C port? First, a couple of devices have a USB C port themselves and come with a USB A to C cable. You can use that cable, or purchase a USB C to C cable, to hook up the device through the new USB C port. Examples of such devices are newer Android phones like Google’s Pixel 2 and 3, V3 WASD CODE keyboards, and ROLI’s BLOCKS and Seaboard.

    I also found adapters turning a USB C port into a USB A; they look as follows.

    A USB C to A adapter

    Such adapters allow using USB C ports like regular USB 3.1 ones, pretty much adding USB ports to the laptop.

    Hubs also exist, but that’s where things become tricky and quite annoying: some USB C ports are not compatible with all hubs!

    Variants of host controller

    The capabilities of a USB C port depend not on the physical connector, but rather on what links it to the system’s motherboard. We call this the host controller. As far as I know, there are three variants of such controllers.

    • Thunderbolt 3. This is the most powerful and versatile variant. A Thunderbolt 3 port can be used as a regular USB 3.1 port using an adapter, it can carry display information, and it can transfer enough power to charge most laptops. You’ll need a hub or docking station to expose all of this; the bare connector alone won’t. Thunderbolt 3 also allows transmitting PCI Express lanes through a cable, pretty much offering extensibility to a system; you can, for example, hook up an external graphics card or a high performance hard drive. The problem with Thunderbolt is that the previous standard relied on a different mini-DisplayPort connector, and many devices are claimed to be Thunderbolt-compatible; you won’t know for sure whether a device is Thunderbolt 2 or 3 unless you look very carefully at the device’s pictures, search on forums, email the vendor, and maybe even then, you may get a Thunderbolt 2 device while expecting a Thunderbolt 3 one and have to return/exchange it. Quite annoying. A device explicitly claimed to be Thunderbolt 3 compatible should be fine, though.
    • USB C with display and power. The port can be used for regular USB 3.1 devices (with an adapter or hub) or to transfer display and power, enough power to charge a laptop through just that small USB C connector. Hubs providing HDMI output can be used with such ports. Some docking stations exist and can be used to charge a laptop and extend its display capabilities with one or two extra ports. Docking, which used to rely on proprietary connectors, is now achievable with a generic, smaller, reversible connector. That’s quite amazing, and a USB C docking station should work with any laptop supporting USB C with display and charging, PC or Mac!
    • Regular USB 3.1 only. USB C doesn’t mandate support for display and power (Thunderbolt 3 does), so some vendors ship limited USB C ports that can only be used with adapters or hubs providing USB C or 3.1 ports! Trying to hook up a hub or docking station with display or charging capabilities to such a port will likely fail. Windows will report that the device malfunctioned, making you think it is defective. But it is not; it is the port that is limited, not the hub that is defective, and the port will still work with USB 3.1 devices (through a proper adapter).

    Lenovo’s Flex 15 has such a limited USB C port. Hooking up the travel hub, offered as an optional accessory while purchasing the laptop, will always fail. The hub wasn’t defective: I tested it on my work laptop, which has a USB C port with display support, and it worked. But not on the Flex 15. All that can be done with the hub is an RMA. Sad but true.

  • Misleading broken connector

    Wednesday, October 23rd 2019, while troubleshooting an issue with my subwoofer, I discovered that one of my S/PDIF optical cables was disconnected; I found the loose cable near my A/V receiver. It was previously used to send audio from my HTPC to my A/V receiver, before I got HDMI I/O on my new Yamaha receiver. I left the cable around in case the HDMI audio started to fail, but that is kind of pointless: if I needed to fall back on it, I would have to reconnect the HDMI cable directly to my TV as well. I wanted to either reconnect the cable or remove it altogether. However, I was unable to find the optical connector on the receiver’s back. All I could find were RCA connectors.

    Having a closer look at the back of the receiver, I discovered that one of the inputs had a broken connector stuck in it, making it look like a female RCA, but a bit different. The image below shows that connector.

    Broken male connector in optical female port

    After I discovered that and confirmed this was supposed to be the optical connector, I tried to remove the broken male connector from the female port using pliers. After a couple of attempts, I got the connector piece out. It looks as follows.

    Broken connector extracted from the port

    The broken end of the cable looks as follows.

    Optical cable with broken connector

    I was tempted to put the broken connector back on the cable.

    The « repaired » connector

    With the metal part screwed back on, the connector no longer seems broken. I didn’t test whether it works; maybe it does, or maybe it would be unstable.

    The complete connector

    That’s it. There’s nothing more to this!

  • A dying laptop

    My Lenovo Ideapad Yoga 15 is dying.

    It happened for the second time today, and that’s quite concerning. Suddenly, the machine freezes: the keyboard stops responding, the mouse still moves, but clicking does nothing. There is no way out other than rebooting, and when I did, the screen went black, then displayed an error message in white on a blue background.

    The machine was telling me that there was no boot device or that booting failed. What? This Lenovo laptop has a built-in mSATA SSD. I never disassembled the laptop, so why would the SSD suddenly become faulty, or the connection between the SSD and the motherboard become unstable? Fixing either would require disassembling the laptop, with the risk of not being able to put it back together afterwards. The keyboard that needs to be removed to reach the components is clipped, and clips can break when you try to remove them.

    As with the last occurrence, during the holidays, powering off the laptop and booting it up again « fixed » it. But who knows how long that will last. Maybe it will fail tonight, maybe next week, I don’t know. And next time it fails, maybe it won’t boot up again.

    Besides that, the battery barely lasts more than 2 hours, and the wi-fi is now so clunky that most of the time I need to plug in a USB to Ethernet or 802.11 adapter.

    I’m wondering if it is Windows 10 that is killing this machine. As soon as I upgraded, the laptop became slower. A few months later, the battery life was reduced. If I could make Ableton Live work on Linux, I would attempt a switch to Ubuntu, but nope, there is no Linux version. All major DAWs offer just Windows or Mac builds, no Linux. Things may change in 10-20 years, but that doesn’t matter to me; my passion may not be there at that time.

    I’m now kind of hopeless about getting any type of working laptop in the next 10-20 years. They all suck. For any candidate, any, I can find bad reports on some forum. People are slowly but surely stopping using computers and/or migrating to Mac, which I cannot do because of font size issues. Maybe home computing is coming to an end, replaced by « smart » but limited phones and tablets not offering any efficient way of typing text. People work around this by posting meaningless pictures without any description; that’s what I see every day on Facebook: XYZ added to his story, no text, just a picture. I came to the point of disabling Facebook notifications on my smartphone and debating the possibility of closing my Facebook account.

    But I want a solution. As a computer scientist, I just cannot give up having a computer; that just makes no sense to me.