I'm looking at my Jetson Nano in the corner, where it's fulfilling its post-retirement role as a paperweight because Nvidia abandoned it after 4 years.
The Nvidia Jetson Nano, an SBC for "AI", debuted with an already-aging custom Ubuntu 18.04. When 18.04 went EOL, Nvidia abandoned it completely, with no further updates to its proprietary JetPack or drivers, and without those the whole machine learning stack (CUDA, PyTorch, etc.) became useless.
I'll never buy an SBC from Nvidia unless all the software support is upstreamed to the Linux kernel.
This is a very important point.
In general, Nvidia's relationship with Linux has been... complicated. On the one hand, at least they offer drivers for it. On the other, I have found few more reliable ways to irreparably break a Linux installation than trying to install or upgrade those drivers. They don't seem to treat it as a first-class citizen, more like they tolerate it with the bare minimum required to claim it works.
> Nvidia's relationship with Linux has been... complicated.

For those unfamiliar with Linus Torvalds' two-word opinion of Nvidia:
Wow. Torvalds' distaste for Nvidia in that clip, albeit a 12-year-old one, leaves little to the imagination. Re: gaming GPUs, Windows is their main OS, but is that the main reason Huang only mentioned Windows in his CES 2025 keynote? Their gaming chips are a small portion of the company now, but they want to focus dev effort on Windows??
Nvidia has its own Linux distribution, DGX OS, based on Ubuntu LTS, but installing other Linux distros on machines with Nvidia GPUs is less than ideal.
Now that the majority of their revenue is from data centers instead of Windows gaming PCs, you'd think their relationship with Linux should improve or already has.
Nvidia segments its big iron AI hardware from the consumer/prosumer segment. They do this by forbidding the use of GeForce drivers in datacenters[1]. All that to say, it is possible for the H100 to have excellent Linux support while support for the 4090 is awful.
1. https://www.datacenterdynamics.com/en/news/nvidia-updates-ge...
They have been making real improvements the last few years. Most of their proprietary driver code is in firmware now, and the kernel driver is open-source[1] (the userland-side is still closed though).
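If you want to check which flavor you're actually running, the kernel module's license string is a decent tell; as far as I know the open kernel modules are dual MIT/GPL licensed while the classic proprietary ones are not, but treat that as a heuristic rather than anything official:

    # Open kernel modules report "Dual MIT/GPL"; the proprietary ones report "NVIDIA"
    modinfo nvidia | grep -i license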
They've also significantly improved support for Wayland and stopped trying to force EGLStreams on the community. Wayland + Nvidia works quite well now, especially after they added explicit sync support.
Because Red Hat announced that the next RHEL is going to be Wayland-only; that's why they fixed everything you mentioned. They don't care about users, only servers.
>complicated
...as in, remember the time a ransomware hacker outfit demanded they release the drivers or else...
https://www.webpronews.com/open-source-drivers-or-else-nvidi...
It's possible. I haven't had a system completely destroyed by Nvidia in the last few years, but I've been assuming that's because I've gotten in the habit of just not touching it once I get it working...
I have been having a fine time with a 3080 on recent Arch, FWIW.
HDR support is still painful, but that seems to be a Linux problem, not specific to Nvidia.
I update drivers regularly. I've only had one display failure, and it was solved by a simple rollback. To be a bit fair (:/), it was specifically a combination of a new beta driver and a newer kernel. It's definitely improved a ton; 10 years ago I just would not update them except very carefully.
I've bricked multiple systems just running apt install on the Nvidia drivers. I have no idea how, but I run the installation, everything works fine, and then when I reboot I can't even boot.
That was years ago, but it happened multiple times and I've been very cautious ever since.
Interesting. I've never had that issue (~15 years of experience), but I've always had CPUs with integrated graphics. Do you think that might be it? The danger zone was always at `startx` and never before. (I still buy CPUs with integrated graphics because I think it's always good to have a fallback, and hey, sometimes I want to sacrifice graphics for GPU compute :)
I've had a similar experience. I'd rather switch CUDA versions by switching the whole PC. What's more, hardware speed and memory improve quickly over time as well.
The Digits device runs the same Nvidia DGX OS (Nvidia's custom Ubuntu distro) that they run on their cloud infra.
I've had a similar experience: my Xavier NX stopped working after the last update and now it's just collecting dust. To be honest, I've found the Nvidia SBCs to be more of a hassle than they're worth.
Xavier AGX owner here to report the same.
My Jetson TX2 developer kit didn't stop working, but it's on a very out of date Linux distribution.
Maybe if Nvidia makes it to four trillion in market cap they'll have enough spare change to keep these older boards properly supported, or at least upstream all the needed support.
Back in 2018 I was involved in a product development effort based on the TX2. I had to untangle the entire nasty mess of Bash and Python spaghetti that is the JetPack SDK to get everything sensibly integrated into our custom firmware build system and workflow (no, copying your application files over a prebaked rootfs on a running board is absolutely NOT how it's normally done). You basically need a few deb packages with Nvidia libs for your userspace, plus a few binaries swiped from JetPack that have to be run with something like 20 undocumented arguments in the right order to do the rest (image assembly, flashing, signing, secure boot stuff, etc.); the rest of the system could be anything. Right when I finished, a 3rd-party Yocto layer implementing essentially the same stuff I had come up with was released, and the world could finally forget about the horrors of JetPack for good. I've also heard that it has improved somewhat since then, but I have not touched any Nvidia SoCs since (due to both trauma and moving to a different field).
Are you aware that mainline Linux runs on these Jetson devices? It's a bit of annoying work, but you can be running ArchLinuxARM.
https://github.com/archlinuxarm/PKGBUILDs/pull/1580
Edit: It's been a while since I did this, but I had to manually build the kernel, possibly overwrite a dtb file (and Linux_for_Tegra/bootloader/l4t_initrd.img), and run something like this (for Xavier):

    sudo ./flash.sh -N 128.30.84.100:/srv/arch -K /home/aeden/out/Image -d /home/aeden/out/tegra194-p2972-0000.dtb jetson-xavier eth0
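For anyone trying to reproduce this, my rough reading of those flash.sh options (going off the L4T flashing docs of that era; the script and its flags change between JetPack releases, so double-check yours):

    # -N 128.30.84.100:/srv/arch                    NFS export to use as the root filesystem
    # -K /home/aeden/out/Image                      externally built kernel image
    # -d /home/aeden/out/tegra194-p2972-0000.dtb    device tree blob for the Xavier devkit
    # jetson-xavier                                 board configuration name
    # eth0                                          root device, i.e. mount the rootfs over Ethernet/NFS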
How close does any of that get a person to having Ubuntu 24.04 running on their board?
(I guess we can put aside the issue of Nvidia's closed source graphics drivers for the moment)
You could install Ubuntu 24.04 using debootstrap. That would just get you the user space, though, you'd still have to build your own kernel image.
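Roughly like this, as a sketch (assuming an arm64 host, or qemu-user-static set up for foreign-arch chroots; 'noble' is the 24.04 series and ports.ubuntu.com carries the arm64 packages):

    # Bootstrap an Ubuntu 24.04 (noble) arm64 userspace into ./rootfs
    sudo debootstrap --arch=arm64 noble ./rootfs http://ports.ubuntu.com/ubuntu-ports
    # Then chroot in and install whatever you need before packing it into a rootfs image
    sudo chroot ./rootfs /bin/bash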
Isn't the Jetson line more of an embedded line and not an end-user desktop? Why would you run Ubuntu?
Jetsons are embedded devices that run Ubuntu; it's the OS they ship with.
The Jetson TX2 developer kit makes a very nice developer machine - an ARM64 machine with good graphics acceleration, CUDA, etc.
In any case, Ubuntu is what it comes with.
If you spent enough time and energy on it, I'm fairly confident you could get the newest Ubuntu running. You'd have to build your own kernel, manually generate the initramfs, and figure out how to flash it all. You'd probably run into stupid little problems, like the partition table the flash script makes not allocating enough space for the kernel you've built. I'm sure there would be hiccups, at the very least, but everything's out there to do it.
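In outline it'd be something like the below. Treat it as a sketch only: the defconfig, cross toolchain, and the final flashing step are assumptions that vary by board and L4T release, and it presumes the target rootfs already has initramfs-tools (plus qemu-user-static on the host if you're cross-chrooting):

    # Cross-build the kernel, device trees and modules for the board
    make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- defconfig
    make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -j"$(nproc)" Image dtbs modules
    # Install the modules into the target rootfs
    sudo make ARCH=arm64 INSTALL_MOD_PATH=./rootfs modules_install
    # Generate an initramfs inside the rootfs for the kernel we just built
    KVER="$(make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -s kernelrelease)"
    sudo chroot ./rootfs update-initramfs -c -k "$KVER"
    # Then hand the resulting Image / dtb / initrd to the vendor flash script (flash.sh -K ... -d ...)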
Wait, my AGX is still working, but I have kept it offline and away from updates. Do the updates kill it? Or is it a case of not supporting newer PyTorch or something else you need?
The Xavier AGX is awesome for running the ESXi aarch64 edition, including aarch64 Windows VMs.
The Orin series and later use UEFI, and you can apparently run upstream, non-GPU-enabled kernels on them; there's a user guide page documenting it. So I think it's gotten a lot better, but it's sort of moot: the non-GPU caveat is there because the JetPack Linux fork has a specific 'nvgpu' driver for Tegra devices that hasn't been unforked from that tree. So you can buy better alternatives unless you're explicitly doing the robotics + AI inference edge stuff.
But the impression I get from this device is that it's closer in spirit to the Grace Hopper/datacenter designs than to the Tegra designs, given the naming, the design (DGX style), and the software (DGX OS?) that goes on their workstation/server designs. Those are also UEFI, and in those scenarios you can (I believe?) use the upstream Linux kernel with the open source Nvidia driver and whatever distro you like. In that case, this would be a much more "familiar" machine with a much more ordinary Linux experience. But who knows. Maybe GH200/GB200 need custom patches, too.
Time will tell, but if this is a good GPU paired with a good ARM Cortex design, and it works more like a traditional Linux box than the Jetson series, it may be a great local AI inference machine.
The AGX also has UEFI firmware, which allows you to install ESXi. Then you can install any generic EFI arm64 ISO in a VM with no problems, including Windows.
It runs their DGX OS, and Jensen specifically said it would be a full part of their HW stack.
If this is DGX OS, then yes, this is what you'll find installed on their 4-card workstations.
This is more like a micro-DGX then, for $3k.
And unless there is some expanded maintenance going on, 22.04 is EOL in 2 years. In my experience, vendors are not as on top of security patches as upstream. We will see, but given NVIDIA's closed ecosystem, I don't have high hopes that this will be supported long term.
Is there any recent, powerful SBC with fully upstream kernel support?
I can only think of raspberry pi...
rk3588 is pretty close; I believe it's usable today, just missing a few corner cases with HDMI or some such. I believe the last patches are either pending or already applied to an RC.
Radha but that’s n100 aka x64
The ODROID H series, but that packs an x86 CPU.
If its stack still works, you might be able to sell or donate it to a student experimenting. They can still learn quite a few things with it. Maybe even use it for something.
Using outdated TensorFlow (v1 from 2018) or outdated PyTorch makes learning harder than it needs to be, considering most resources online use much newer versions of the frameworks. If you're learning the fundamentals, working from first principles, and creating the building blocks yourself, then it adds to the experience. However, most people just want to build different types of nets, and that's hard to do when the code won't work for you.
If you're expecting this device to stay relevant for 4 years you are not the target demographic.
Compute is evolving way too rapidly to be setting-and-forgetting anything at the moment.
Today I'm using 2x 3090s, which are over 4 years old at this point and still very usable. To get 48 GB of VRAM from new cards I would need 3x 5070 Ti (16 GB each), still over $2k.
In 4 years, you'll be able to combine 2 of these to get 256 GB of unified memory. I expect that to have many uses and still be in a favorable form factor and at a favorable price.
Eh? By all indications compute is now evolving SLOWER than ever. Moore's Law is dead, Dennard scaling is over, the latest fab nodes are evolutionary rather than revolutionary.
This isn't the 80s when compute doubled every 9 months, mostly on clock scaling.
Indeed, generational improvements are at an all time low. Most of the "revolutionary" AI and/or GPU improvements are less precision (fp32 -> fp16 -> fp8 -> fp4) or adding ever more fake pixels, fake frames, and now in the most recent iteration multiple fake frames per computed frame.
I believe Nvidia published some numbers for the 5000 series showing performance with DLSS off, which allowed a fair comparison to the previous generation (gains on the order of 25%), and then removed them.
Thankfully the 3rd party benchmarks that use the same settings on old and new hardware should be out soon.
Fab node size is not the only factor in performance. Physical limits were reached, and we're pulling back from the extremely small stuff for the time being. That is the evolutionary part.
Revolutionary developments are: multi-layer wafer bonding, chiplets (collections of interconnected dies), and backside power delivery. We don't need the transistors to keep getting physically smaller, we need more of them, and at increased efficiency, and that's exactly what's happening.
All that comes with linear increases in heat, and heat dissipation that gets disproportionately harder as you stack (square-cube law).
There is still progress being made in hardware, but for most critical components it's looking far more logarithmic now as we're approaching the physical material limits.