The Linux kernel config system, Kconfig, uses a macro language very similar to the make build tool's macro language. There are a few differences, however. And of course, make is designed as a general-purpose build tool while Kconfig is Linux-kernel-specific. But, why would the kernel developers create a whole new macro language so closely resembling that of an existing general-purpose tool?
One reason became clear recently when Linus Torvalds asked developers to add an entirely new system of dependency checks to the Kconfig language, specifically testing the capabilities of the GCC compiler.
It's actually an important issue. The Linux kernel wants to support as many versions of GCC as possible—so long as doing so would not require too much insanity in the kernel code itself—but different versions of GCC support different features. The GCC developers always are tweaking and adjusting, and GCC releases also sometimes have bugs that need to be worked around. Some Linux kernel features can only be built using one version of the compiler or another. And, some features build better or faster if they can take advantage of various GCC features that exist only in certain versions.
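To make the version dependence concrete, here is a minimal C sketch of how source code can adapt to the compiler building it. It is an illustration only, not code from the kernel. The `EXAMPLE_`-prefixed macros and `example_` functions are hypothetical names. `__GNUC__`, `__GNUC_MINOR__`, `__GNUC_PATCHLEVEL__` and `__has_attribute` are macros that GCC itself predefines (`__has_attribute` only in newer releases, hence the fallback).

```c
#include <stdio.h>

/* Older compilers lack __has_attribute; treat every query as "unsupported". */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Apply the "cold" hint only where the compiler reports support for it. */
#if __has_attribute(cold)
#define EXAMPLE_COLD __attribute__((cold))
#else
#define EXAMPLE_COLD
#endif

/* Fold major.minor.patch into one comparable integer; 0 if not GCC-compatible. */
#if defined(__GNUC__)
#define EXAMPLE_GCC_VERSION \
    (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
#define EXAMPLE_GCC_VERSION 0
#endif

/* Returns 1 if the compiler identifies as GCC major.minor or newer. */
int example_gcc_at_least(int major, int minor)
{
    return EXAMPLE_GCC_VERSION >= major * 10000 + minor * 100;
}

/* A rarely taken error path, marked cold when the attribute is available. */
EXAMPLE_COLD void example_report(const char *what)
{
    fprintf(stderr, "unsupported: %s\n", what);
}
```

Checks of this kind take effect only when the file is compiled, well after configuration has finished.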
Up until this year, the kernel build system has had to check all those compiler features by hand, using many hacky methods. The art of probing a tool to find out if it supports a given feature dates back decades and is filled with insanity. Imagine giving a command that you know will fail, but giving it anyway because the specific manner of failure will tell you what you need to know for a future command to work. Now imagine hundreds of hacks like that in the Linux kernel build system.
Part of the problem with having those hacky checks in the build system is that you find out about them only during the build—not during configuration. But since some kernel features require certain GCC versions, the proper place to learn about the GCC version is at config time. If the user's compiler doesn't support a given feature, there's no reason to show that feature in the config system. It should just silently not exist.
Linus requested that developers migrate those checks into the Kconfig system and regularize them into the macro language itself. This way, kernel features with particular GCC dependencies could identify those dependencies and then show up or not show up at config time, according to whether those dependencies had been met.
That's the reason simply using make wouldn't work. The config language had to represent the results of all those ugly hacks in a friendly way that developers could make use of.
The code to do this has been added to the kernel tree, and Masahiro Yamada recently posted some documentation to explain how to use it. The docs are essentially fine, although the code will gradually grow and grow as new versions of GCC require new hacky probes.
It's actually not so easy to know what should and should not go into the config system. If we're probing for GCC versions, why not probe for hardware peripherals as well? Why leave this for the kernel to do at runtime? It's not necessarily clear. In fact, it's an open debate that ultimately could swing either way. Dumping all this GCC-detection code into Kconfig may make Kconfig better able to handle other such dumps that previously would have seemed like too much. The only way we'll really know is to watch how the kernel developers probe Linus to see what he'll accept and what would be going too far.
The Linux kernel has various debugging tools. One is the kernel function tracer, which traces function calls, looking for bad memory allocations and other problems.
Changbin Du from Intel recently posted some code to increase the range of the function tracer by increasing the number of function calls that were actually compiled into the kernel. Not all function calls are ever actually compiled—some are "inlined", a C feature that allows the function code to be copied to the location that calls it, thus letting it run faster. The downside is that the compiled binary grows by the number of copies of that function it has to store.
But not all inlining is specifically intended by the developers. The GNU Compiler Collection (GCC) also uses its own heuristics to decide to inline a wide array of functions. Whenever it does this in the Linux kernel, the function tracer has nothing to trace.
Changbin's code still would allow functions to be inlined, but only if they explicitly used the inline keyword of the C language. All other inlining done by GCC itself would be prevented. This would produce less efficient code, so Changbin's code never would be used in production kernel builds. But on the other hand, it would produce code that could be far more thoroughly examined by the function tracer, so Changbin's code would be quite useful for kernel developers.
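The distinction between requested and automatic inlining can be sketched in a few lines of C. This is an illustration, not Changbin's patch. The `example_` names and the `EXAMPLE_NOINLINE` macro are hypothetical, and `__attribute__((noinline))` is a GCC extension used here only to show a per-function way of forcing a real call.

```c
/* noinline is a GCC/Clang extension; expand to nothing elsewhere. */
#if defined(__GNUC__)
#define EXAMPLE_NOINLINE __attribute__((noinline))
#else
#define EXAMPLE_NOINLINE
#endif

/* Explicitly requested inlining: the developer's intent is in the source. */
static inline int example_add(int a, int b)
{
    return a + b;
}

/*
 * No "inline" keyword here. Depending on the optimization level, GCC may
 * still fold this small helper into its caller on its own, leaving no
 * separate call for a tracer to observe.
 */
static int example_scale(int x)
{
    return x * 3;
}

/* Marked noinline so that calls to it remain real, traceable calls. */
EXAMPLE_NOINLINE int example_traced(int x)
{
    return example_scale(example_add(x, 1));
}
```

GCC also documents command-line options such as `-fno-inline-functions` and `-fno-inline-small-functions` for turning off its automatic inlining across a whole build. Which options the patch actually passes is not stated here.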
As soon as he posted the patches, bug reports popped up all over the kernel in functions that GCC had been silently inlining. As a result, absolutely nobody had any objections to this particular patch.
There were, however, some odd false positives produced by the function tracer, claiming that it had found bugs that didn't actually exist. This gave a few kernel developers a slight pause, and they briefly debated how to eliminate those false positives, until they realized it didn't really matter. They reasoned that the false positives probably indicated a problem with GCC, so the GCC people would want to be able to see those false positives rather than have them hidden away behind workarounds.
That particular question—what is a kernel issue versus a GCC issue—is potentially explosive. It didn't lead anywhere this time, but in the past, it has led to bitter warfare between the kernel people and the GCC people. One such war was over GCC's failure to support Pentium processors and led to a group of developers forking GCC development into a competing project, called egcs. The fork was very successful, and it began to be used in mainstream Linux distributions instead of GCC. Ultimately, the conflict between the two branches was resolved only after the egcs code was merged into the GCC main branch, and future GCC development was handed over to the egcs team of developers in 1999.
Sometimes kernel developers find themselves competing with each other to get their version of a particular feature into the kernel. But sometimes developers discover they've been working along very similar lines, and the only reason they hadn't been working together was that they just didn't know each other existed.
Recently, Jian-Hong Pan asked if there was any interest in a LoRaWAN subsystem he'd been working on. LoRaWAN is a commercial networking protocol implementing a low-power wide-area network (LPWAN), allowing relatively slow communications between things, generally phone sensors and other Internet of Things (IoT) devices. Jian-Hong posted a link to the work he'd done so far: https://github.com/starnight/LoRa/tree/lorawan-ndo/LoRaWAN.
He specifically wanted to know "should we add the definitions into corresponding kernel header files now, if LoRaWAN will be accepted as a subsystem in Linux?" The reason he was asking was that each definition had its own number. Adding them into the kernel would mean the numbers associated with any future LoRaWAN subsystem would stay the same during development.
However, Marcel Holtmann explained the process:
"When you submit your LoRaWAN subsystem to netdev for review, include a patch that adds these new address family definitions. Just pick the next one available. There will be no pre-allocation of numbers until your work has been accepted upstream. Meaning, that the number might change if other address families get merged before yours. So you have to keep updating. glibc will eventually follow the number assigned by the kernel."
Meanwhile, Andreas Färber said he'd been working on supporting the same protocol himself and gave a link to his own proof-of-concept repository: https://github.com/afaerber/lora-modules.
On learning about Andreas' work, Jian-Hong's response was, "Wow! Great! I get new friends :)"
That's where the public conversation ended. The two of them undoubtedly have pooled their energies and will produce a new patch, better than either of them might have done separately.
It's interesting to me the way some projects are more amenable to merging together than others. It seems to have less to do with developer personalities, and more to do with how much is at stake in a given area of the kernel. A new load-balancing algorithm may improve the user experience for some users and worsen it for others, depending on their particular habits. How can two developers resolve their own questions about which approach is better, given that it's not feasible to have lots of different load balancers all in the kernel together? Wars have gone on for years over such issues. On the other hand, supporting a particular protocol or a particular peripheral device is much easier. For one thing, having several competing drivers in the kernel is generally not a problem, at least in the short term, as long as they don't dig too deeply into core kernel behaviors. Developers can test their ideas on a live audience and see what really works better and what doesn't. That sort of freedom disappears the closer you get to real speed issues and real security issues.
Recently, there was a disagreement over whether a subsystem really addressed its core purpose or not. That's an unusual debate to have. Generally developers know if they're writing support for one feature or another.
In this particular case, Johan Hovold posted patches to add a GNSS subsystem (Global Navigation Satellite System), used by GPS devices. His idea was that commercial GPS devices might use any input/output ports and protocols—serial, USB and whatnot—forcing user code to perform difficult probes in order to determine which hardware it was dealing with. Johan's code would unify the user interface under a /dev/gnss0 file that would hide the various hardware differences.
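From user space, the appeal of a single device file is that reading a receiver looks the same regardless of how it is attached. Here is a minimal sketch, assuming the device node behaves like an ordinary readable character device. `example_read_gnss` is a hypothetical helper, and the format of the bytes it returns depends on the receiver.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len bytes of raw receiver output from the given device path.
 * Returns the number of bytes read (0 at end of file), or -1 on error.
 */
ssize_t example_read_gnss(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;              /* device missing or not permitted */

    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

A caller would pass "/dev/gnss0" as the path and would not need to know whether the hardware sits behind a serial port, USB or something else.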
But, Pavel Machek didn't like this at all. He said that there wasn't any actual GNSS-specific code in Johan's GNSS subsystem. There were a number of GPS devices that wouldn't work with Johan's code. And, Pavel felt that at best Johan's patch was a general power management system for serial devices. He felt it should not use names (like "GNSS") that then would be unavailable for a "real" GNSS subsystem that might be written in the future.
However, in kernel development, "good enough" tends to trump "good but not implemented". Johan acknowledged that his code didn't support all GPS devices, but he said that many were proprietary devices using proprietary interfaces, and those companies could submit their own patches. Also, Johan had included two GPS drivers in his patch, indicating that even though his subsystem might not contain GNSS-specific code, it was still useful for its intended purpose—regularizing the GPS device interface.
The debate went back and forth for a while. Pavel seemed to have the ultimate truth on his side: Johan's code was at best misnamed, and at worst, incomplete and badly structured. Johan, though, had real-world usefulness on his side: something like his patch had been requested by other developers for a long time, and it solved actual problems confronted by people today.
Finally Greg Kroah-Hartman put a stop to all debate—at least for the moment—by simply accepting the patch and feeding it up to Linus Torvalds for inclusion in the main kernel source tree. He essentially said that there was no competing patch being offered by anyone, so Johan's patch would do until anything better came along.
Pavel didn't want to give up so quickly, and he tried at least to negotiate a name change away from "GNSS", so that a "real" GNSS subsystem might still come along without a conflict. But with his new-found official support, Johan said, "This is the real gnss subsystem. Get over it."
It's an odd situation. Then again, the Linux kernel generally avoids trying to stake out territory for infrastructure that doesn't yet exist. It may be that Johan's non-GNSS GNSS subsystem will be all that's ever needed for GPS device support, in which case, why assume it will ever be more complicated than this? Famous last words.
Note: if you're mentioned in this article and want to send a response, please send a message with your response text to ljeditor@linuxjournal.com and we'll run it in the next Letters section and post it on the website as an addendum to the original article.
