In this video I discuss Ubuntu's decision to switch to Rust implementations of the core utilities (mkdir, ls, cat, etc.) and what it could mean for the broader Linux ecosystem.
Is there any actual benefit?
Well, the Rust project is MIT licensed, so definitely not.
Code written in Rust has been shown to have significantly fewer security vulnerabilities than code written in C. Distributions like Ubuntu ship a lot of security updates, so by switching to Rust-based utils, they can reduce their workload in the long run.
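For a rough picture of the class of bug behind most of those C security advisories, here is a toy snippet (hypothetical, not from any real utility): an off-by-one write that a C compiler accepts without complaint, while the same indexing in safe Rust is refused at compile time for a constant index or stopped by a runtime bounds check.

    /* Toy example (not from any real utility): an off-by-one write past the
     * end of a buffer. C compiles this without complaint; safe Rust rejects
     * the constant out-of-bounds index or panics at the bounds check instead
     * of silently corrupting memory. */
    #include <stdio.h>

    int main(void) {
        char name[8];
        for (int i = 0; i <= 8; i++)      /* i == 8 is one past the end */
            name[i] = 'A' + i;
        printf("%.8s\n", name);
        return 0;
    }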
After they introduced Ubuntu Pro, I don’t think so.
There’s probably some zero-day exploit someone is holding onto until everything is Rust and then, bam! Yeah, that’s just silly to think. Just silly.
Just security and hype afaik.
No, it isn’t just hype. The hype is justified.
Outside of security you have some very real-world benefits, like performance gains in various scenarios, as well as lots more people willing to contribute and a much better type system (more maintainability).
It’s been proven faster. That’s all I personally know.
Nothing except for raw binary coding can be faster than C, I think.
Rust is better for writing multithreaded applications, which means that the small number of utilities that can exploit parallelism get a significant speedup. The uutils multithreaded sort was apparently 6x faster than the single-threaded GNU version (see the sketch just below).
P.S. I strongly doubt handwritten assembly is more efficient than what modern C compilers produce.
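Rough sketch of the parallel-sort idea mentioned above, written in C for illustration (the actual uutils sort is Rust and splits the work much more finely): sort independent chunks concurrently, then merge.

    /* Rough sketch (not uutils code, which is Rust): why a multithreaded sort
     * helps. Sort two halves of the data concurrently, then merge them.
     * Build with: cc -O2 -pthread parsort.c */
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N 2000000

    static int cmp_int(const void *a, const void *b) {
        int x = *(const int *)a, y = *(const int *)b;
        return (x > y) - (x < y);
    }

    struct chunk { int *data; size_t len; };

    static void *sort_chunk(void *arg) {
        struct chunk *c = arg;
        qsort(c->data, c->len, sizeof c->data[0], cmp_int);
        return NULL;
    }

    int main(void) {
        int *a = malloc(N * sizeof *a);
        int *merged = malloc(N * sizeof *merged);
        for (size_t i = 0; i < N; i++) a[i] = rand();

        /* Each half is sorted on its own thread. */
        struct chunk lo = { a, N / 2 }, hi = { a + N / 2, N - N / 2 };
        pthread_t t1, t2;
        pthread_create(&t1, NULL, sort_chunk, &lo);
        pthread_create(&t2, NULL, sort_chunk, &hi);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);

        /* Single-threaded merge of the two sorted halves. */
        size_t i = 0, j = N / 2, k = 0;
        while (i < N / 2 && j < N)
            merged[k++] = a[i] <= a[j] ? a[i++] : a[j++];
        while (i < N / 2) merged[k++] = a[i++];
        while (j < N)     merged[k++] = a[j++];

        printf("min=%d max=%d\n", merged[0], merged[N - 1]);
        free(a);
        free(merged);
        return 0;
    }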
My simple assembly program can run circles around compilers. As long as something is small, it is possible to optimize it better than a C compiler.
Compilers have a lot of challenges just to compile at all, let alone optimize. Register allocation alone is a big problem. An inherent problem is that the compiler does not know what the program is supposed to do. Humans still write better assembly than compilers.
The one down arrow on the guy you are responding to is from me, just so everybody knows.
This is just wrong: the compiler (and linker) knows exactly what the program does, as it has the ENTIRE source code available. Compilers have been so good for the last 20 years that it is quite hard to write things faster in assembly/machine code.
One of the harder parts about assembly is keeping track of which registers a subroutine uses and which ones are available; as the program grows larger you might be forced to push/pop to the stack all the time. Inlining code is also difficult in assembler, whereas the compiler is quite adept at it.
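A tiny hypothetical example of the inlining point: at -O2, gcc and clang will normally inline the helper call below (and may even fold the whole loop to a constant), so there is no call overhead left to tune by hand.

    /* Hypothetical example of compiler inlining: at -O2 the call to square()
     * normally disappears into the loop, and the loop itself may be folded
     * to a constant. Inspect the output with: cc -O2 -S inline_demo.c */
    #include <stdio.h>

    static long square(long x) {
        return x * x;
    }

    int main(void) {
        long sum = 0;
        for (long i = 0; i < 1000; i++)
            sum += square(i);                /* call is inlined away at -O2 */
        printf("%ld\n", sum);
        return 0;
    }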
It might have been true up until the 90s, but then compilers started getting so good (Watcom) that there was rarely any point in writing assembler code, unless there was some extremely hardware-specific thing that needed to be done.
Look, I wrote plenty of assembly. A human knows how the code will flow. A compiler knows how everything is linked together, but it does not know exactly how the code will flow. In higher-level languages, like C, we don’t always think about things like which branch is more likely (often many times more likely).
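For what it is worth, when the likely branch does matter, GCC and Clang let you state it explicitly; a small hypothetical sketch using the __builtin_expect extension (C++20 has [[likely]]/[[unlikely]]):

    /* Sketch of stating branch likelihood explicitly in C (GCC/Clang
     * extension). Hypothetical example, not from any real utility. */
    #include <stdio.h>
    #include <stdlib.h>

    #define unlikely(x) __builtin_expect(!!(x), 0)

    static int parse_count(const char *s) {
        char *end;
        long v = strtol(s, &end, 10);
        if (unlikely(*end != '\0' || v < 0)) {   /* error path, rarely taken */
            fprintf(stderr, "bad count: %s\n", s);
            return -1;
        }
        return (int)v;
    }

    int main(void) {
        printf("%d\n", parse_count("42"));
        return 0;
    }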
Memory is the real performance winner, and yes, registers play a big role in that. Cache is even more important, but that depends on data layout and how it is processed, and that is practically the same in C and asm.
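The classic illustration of the data-layout point, sketched in C: summing the same 2D array row-by-row (sequential memory in C’s row-major layout) versus column-by-column (strided access that typically misses cache once the array outgrows it).

    /* Classic data-layout illustration: same sum, two traversal orders.
     * Build with: cc -O2 cache_demo.c */
    #include <stdio.h>
    #include <time.h>

    #define ROWS 4096
    #define COLS 4096

    static int m[ROWS][COLS];                /* ~64 MB, far larger than cache */

    int main(void) {
        long long sum = 0;
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                m[i][j] = i + j;             /* fill so nothing is folded away */

        clock_t t = clock();
        for (int i = 0; i < ROWS; i++)
            for (int j = 0; j < COLS; j++)
                sum += m[i][j];              /* row-wise: walks memory in order */
        printf("row-wise:    %.3fs\n", (double)(clock() - t) / CLOCKS_PER_SEC);

        t = clock();
        for (int j = 0; j < COLS; j++)
            for (int i = 0; i < ROWS; i++)
                sum += m[i][j];              /* column-wise: 16 KB stride */
        printf("column-wise: %.3fs\n", (double)(clock() - t) / CLOCKS_PER_SEC);

        return (int)(sum & 1);               /* keep sum observable */
    }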
C compilers don’t even use every GP register on amd64. And you know exactly what you need when you go into some procedure.
And when you get called / call outside of your… object file in C (or the C ABI), you have to follow the convention:
“Functions preserve the registers rbx, rsp, rbp, r12, r13, r14, and r15; while rax, rdi, rsi, rdx, rcx, r8, r9, r10, r11 are scratch registers.”
So you put those on the stack, and libraries have calling overhead (granted, there is LTO).
In assembly you can even use the SSE registers as your scratchpad, pulling and putting arbitrary data in them (even pointers). Compilers never do that. (SSE registers can hold much more than the GP registers.)
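A minimal sketch of the calling overhead being described, assuming two separate object files, no LTO, and the System V AMD64 convention quoted above:

    /* Because the compiler cannot see into helper(), it must assume the call
     * clobbers every scratch register (rax, rdi, rsi, rdx, rcx, r8-r11), so
     * values it still needs go into callee-saved registers or onto the stack. */

    /* helper.c */
    long helper(long x) { return 3 * x + 1; }

    /* main.c -- build with: cc -O2 main.c helper.c */
    #include <stdio.h>

    long helper(long x);                     /* defined in the other file */

    int main(void) {
        long total = 0;
        for (long i = 0; i < 10; i++)
            total += helper(i);              /* total must survive each call */
        printf("%ld\n", total);
        return 0;
    }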
In asm you have to know exactly how memory is handled, while C is a bit abstracted.
If you want to propagate such claims, read the poorly informed… “Hello, I am a compiler” poem?
But it’s easy to see how much a compiler doesn’t optimize by comparing compilers and compiler flags: GCC vs LLVM, O3 vs Os and even O2. Which one performs best is unpredictable; LLVM at Os could be the fastest, depending on the program. The differences are sometimes over 10%.
The biggest problem with writing in asm is that you have to plan a lot. It’s annoying, which is why I write in higher-level languages now.
Edit: Oh, and I didn’t even talk about instructions that C can’t express, nor the FLAGS register.
The last time I wrote assembly I was making a makeshift sound card thing with an Arduino. I hooked a speaker up to the GPIO and was toggling the bit.
In large applications maybe not, but in benchmarks there can be perfectly optimized assembly.
Of course, for hot paths or small examples it is, but I doubt it’s feasible or maintainable to write “real” projects like core utilities in assembly.
Everyone knows you can do Roller Coaster Tycoon at most; no way you could do core utilities.
I’m sure you can do cat in assembly.
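It is basically a read/write loop; here is a minimal C sketch (the assembly version is the same two syscalls spelled out by hand, and a real cat does much more, with options, file arguments, and so on):

    /* Minimal cat-like loop (no options, no file arguments): copy stdin to
     * stdout. */
    #include <unistd.h>

    int main(void) {
        char buf[4096];
        ssize_t n;
        while ((n = read(STDIN_FILENO, buf, sizeof buf)) > 0) {
            ssize_t off = 0;
            while (off < n) {
                ssize_t w = write(STDOUT_FILENO, buf + off, (size_t)(n - off));
                if (w < 0)
                    return 1;                /* write error */
                off += w;
            }
        }
        return n < 0 ? 1 : 0;                /* n < 0 means a read error */
    }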
As with everything, it all depends.
When writing super-efficient assembly you write toward the destination, not necessarily to fit higher-level language constructs. There are often ways to cut corners on aspects that aren’t needed: fewer instructions, fewer loops, all based on well-designed assembly.
The problem is you aren’t going to do that for every single CPU instruction, because it would take forever and not provide a good ROI. It is far more common to write 99% of your system code in C and then write just the parts that can really benefit from fine-tuned assembly. And please note that unless you’re writing for an RTOS or something crazy critical on efficiency, it’s going to be even less assembly.
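A small hypothetical sketch of that split, using one GCC/Clang inline-asm statement (x86-64 rdtsc) while everything else stays in plain C:

    /* Sketch of the "mostly C, a little asm" pattern: one inline asm
     * statement reads the x86-64 timestamp counter; the rest is plain C.
     * Hypothetical example, x86-64 with GCC or Clang only. */
    #include <stdint.h>
    #include <stdio.h>

    static uint64_t read_tsc(void) {
        uint32_t lo, hi;
        __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi));   /* result in edx:eax */
        return ((uint64_t)hi << 32) | lo;
    }

    int main(void) {
        uint64_t start = read_tsc();
        volatile uint64_t sum = 0;           /* volatile keeps the loop around */
        for (int i = 0; i < 1000000; i++)
            sum += (uint64_t)i;
        uint64_t end = read_tsc();
        printf("sum=%llu ticks~=%llu\n",
               (unsigned long long)sum, (unsigned long long)(end - start));
        return 0;
    }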
Multithreading isn’t a true efficiency benefit. I was talking about different things there.
Fortran
I’m not sure why people are downvoting you, since Fortran is known to be extremely performant when dealing with multidimensional arrays.