@o11c

o11c@programming.dev · 1 year ago

For one thing: don’t bother with fancy log destinations. Just log to stderr and let your daemon manager take care of directing that where it needs to go (systemd made life a lot easier in the Linux world).

Structured logging is overrated since it means you can’t just do the above.

Per-module (filterable) logging is quite useful, but it must be automatic (use __FILE__, __name__, or whatever your language supports) or you will never actually do it. All semi-reasonable languages support some form of either macros-which-capture-the-current-module-and-location or peek-at-the-caller-module-name-and-location.
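
A minimal sketch of what "automatic" looks like in Python, assuming the standard logging module. The logger names app.net and app.db are made up and stand in for two modules' __name__ values; in a real module the line is always logging.getLogger(__name__). A plain StreamHandler() writes to stderr; a StringIO is passed here only so the result can be inspected.

```python
import io
import logging

# StreamHandler() with no argument logs to stderr; StringIO is used for inspection only.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s %(name)s %(filename)s:%(lineno)d %(message)s")
)

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)

# Hypothetical module names; each real module would call logging.getLogger(__name__).
net_log = logging.getLogger("app.net")
db_log = logging.getLogger("app.db")

# Per-module filtering: enable DEBUG for one module only.
net_log.setLevel(logging.DEBUG)

net_log.debug("handshake started")  # emitted
db_log.debug("query planned")       # filtered out, inherits WARNING from root
db_log.warning("slow query")        # emitted

output = stream.getvalue()
```

The file name and line number in the format string are captured by the library, so nobody has to type them.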

One subtle part of logging: never conditionally defer a computation that can fail. Many logging APIs ultimately support something like:

if (log_level >= INFO) // or <= depending on how levels are numbered
    do_log(INFO, message, arguments...)

This is potentially dangerous - if logging of that level is disabled, the code is never tested, and trying to enable logging later might introduce an error when evaluating the arguments or formatting them into the message. Also, if logging of that level is disabled, side-effects might not happen.

To avoid this, do one of:

- never use the if-style deferring, internally or externally. Instead, squelch the I/O only. This can have a significant performance cost (especially at the DEBUG level), which is why the deferring API exists in the first place.
- ensure that your type system can statically verify that runtime errors are impossible in the conditional block. This requires that you are using a sane language and logging library.
- run your testsuite at every log level, ensure 100% coverage of log code, and hope that the inevitable logic bug doesn’t have an unexpected dynamic failure.
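
The hazard and the first option can be sketched in Python, assuming the standard logging module. The describe helper and its bug are hypothetical, invented purely to have a computation that can fail.

```python
import logging

log = logging.getLogger("demo.deferred")
log.addHandler(logging.NullHandler())
log.propagate = False

def describe(user):
    # Hypothetical helper with a latent bug: fails when the "name" key is missing.
    return "user=" + user["name"]

def handle_deferred(user):
    # if-style deferral: describe() only runs when DEBUG is enabled.
    if log.isEnabledFor(logging.DEBUG):
        log.debug("handling %s", describe(user))
    return "ok"

def handle_eager(user):
    # Always evaluate the argument; only the output is squelched by the level.
    log.debug("handling %s", describe(user))
    return "ok"

bad_user = {}

log.setLevel(logging.INFO)
deferred_result_at_info = handle_deferred(bad_user)  # bug stays hidden

log.setLevel(logging.DEBUG)
try:
    handle_deferred(bad_user)
    deferred_failed_at_debug = False
except KeyError:
    deferred_failed_at_debug = True  # bug appears only once DEBUG is enabled

log.setLevel(logging.INFO)
try:
    handle_eager(bad_user)
    eager_failed_at_info = False
except KeyError:
    eager_failed_at_info = True  # bug is visible at every level
```

The eager version pays for describe() on every call, which is exactly the performance cost mentioned above.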

o11c@programming.dev · 1 year ago

To be fair, Secure Boot is actively hostile toward dual-booting in the first place. Worst of all, it might seem to work for a while, then suddenly start causing errors sometime later.

o11c@programming.dev · 1 year ago

From my experience, Cinnamon is definitely highly immature compared to KDE. Very poor support for virtual desktops is the thing that jumped out at me most. There were also some problems regarding shortcuts and/or keyboard layout I think, and probably others, but I only played with it for a couple weeks while limited to LiveCD.

o11c@programming.dev · 1 year ago

ReplaceFile exists to get everyone else’s semantics though?

o11c@programming.dev · 1 year ago

Related, note that division is much slower than multiplication.

Instead of:

n / d

see if you can refactor it to:

n * (1.0/d)

where that inverse can then be hoisted out of loops.
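
A rough sketch of the hoisting in Python; the function names are made up for illustration. In CPython the interpreter overhead dominates, so this shows the shape of the refactor rather than a measurable speedup. Note that the two forms are not guaranteed to give bit-identical results, which is why compilers generally won't do this for you without a fast-math style flag.

```python
import math

def scale_divide(values, d):
    # One division per element.
    return [v / d for v in values]

def scale_reciprocal(values, d):
    # Compute the reciprocal once, outside the loop, then multiply.
    inv = 1.0 / d
    return [v * inv for v in values]

values = [1.0, 2.5, 10.0, 1e6]
a = scale_divide(values, 3.0)
b = scale_reciprocal(values, 3.0)

# The results agree closely, but may differ in the last bit.
close = all(math.isclose(x, y, rel_tol=1e-12) for x, y in zip(a, b))
```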

o11c@programming.dev · 1 year ago

This is about the one thing where SQL is a badly designed language, and you should use a frontend that forces you to write your queries in the order (table, filter, columns) for consistency.

UPDATE table_name WHERE y = $3 SET w = $1, x = $2, z = $4 RETURNING *
FROM table_name SELECT w, x, y, z
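
A minimal sketch of such a frontend in Python: the arguments arrive in (table, filter, columns) order and ordinary SQL comes out. The function names and signatures are invented for illustration, not taken from any real library. Identifiers are pasted in directly, so this assumes they come from your own code and that all values go through placeholders.

```python
def select(table, where, columns):
    # Arguments are in (table, filter, columns) order; output is standard SQL.
    sql = "SELECT " + ", ".join(columns) + " FROM " + table
    if where:
        sql += " WHERE " + where
    return sql

def update(table, where, assignments, returning="*"):
    # assignments maps column name -> placeholder, e.g. {"w": "$1"}.
    sets = ", ".join(f"{col} = {ph}" for col, ph in assignments.items())
    sql = f"UPDATE {table} SET {sets}"
    if where:
        sql += " WHERE " + where
    if returning:
        sql += " RETURNING " + returning
    return sql

update_sql = update("table_name", "y = $3", {"w": "$1", "x": "$2", "z": "$4"})
select_sql = select("table_name", None, ["w", "x", "y", "z"])
```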

o11c@programming.dev · 1 year ago

Obviously the actual programs are trivial. The question is, how are the tools supposed to be used?

So you say to use deno? Out of all the tutorials I found telling me what tools to use, that wasn’t one of them (I really thought this “typescript” package would be the thing I was supposed to use; I just checked again on a hot cache and it was 1.7 seconds real time, 4.5 seconds cpu time, only 2.9 seconds if I pin everything to a single core). And I swear I just saw this week, people saying “seriously, don’t use deno”. It also doesn’t seem to address the browser use case at all though.

In other languages I know, I know how to write 4 files (the fib library and 3 frontends), and compile and/or execute them separately. I know how to shove all of them into a single blob with multiple entry points selected dynamically. I know how to shove just one frontend with the library into a single executable. I know how to separately compile the library and each frontend, producing 4 separate artifacts, with the library being dynamically replaceable. I even know how to leave them as loose files and execute them directly (barring things like C). I can choose between these things all in a single codebase, since there are no hard-coded project filenames.

I learned these things because I knew I wanted the ability from previous languages I’d learned, and very quickly found how the new language’s tools supported that.

I don’t have that for TS (JS itself seems to be fine, since I have yet to actually need all the polyfill spam). And every time I try to find an answer, I get something that contradicts everything I read before.

That is why I say that TS is a hopelessly immature ecosystem.

o11c@programming.dev · 1 year ago

I’m not concerned about Microsoft’s involvement. TypeScript shows an immature tooling ecosystem even on its own merits.

I posted some of my concerns earlier, along with a basic problem challenge (that I can easily do in many other languages) that nobody managed to solve: https://programming.dev/comment/2734178

o11c@programming.dev · 1 year ago

It’s because unicode was really broken, and a lot of the obvious breakage was when people mixed the two. So they did fix some of the obvious breakage, but they left a lot of the subtle breakage (in addition to breaking a lot of existing correct code, and introducing a completely nonsensical bytes class).

o11c@programming.dev · 1 year ago

I’ve only ever seen two parts of git that could arguably be called unintuitive, and they both got fixes:

- git reset seems to do 2 unrelated things for some people. Nowadays git restore exists.
- the inconsistent difference between a..b and a...b commit ranges in various commands. This is admittedly obscure enough that I would have to look up the manual half the time anyway.

I suppose we could also call it unintuitive that man git foo didn’t use to work.

The tooling to integrate git submodule into normal tree operations could be improved though. But nowadays there’s git subtree for all the people who want to do it wrong but easily.

The only reason people complain so much about git is that it’s the only VCS that’s actually widely used anymore. All the others have worse problems, but there’s nobody left to complain about them.

o11c@programming.dev · 1 year ago

Python 2 had one mostly-working str class, and a mostly-broken unicode class.

Python 3, for some reason, got rid of the one that mostly worked, leaving no replacement. The closest you can get is to spam surrogateescape everywhere, which is both incorrect and has significant performance cost - and that still leaves several APIs unavailable.

Simply removing str indexing would’ve fixed the common user mistake if that was really desirable. It’s not like unicode indexing is meaningful either, and now large amounts of historical data can no longer be accessed from Python.
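
For anyone who hasn't run into it, a small sketch of what the surrogateescape workaround looks like. The byte string is a made-up example of historical data (a Latin-1 encoded filename) that is not valid UTF-8.

```python
raw = b"caf\xe9.txt"  # Latin-1 encoded name, not valid UTF-8

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape smuggles each undecodable byte through as a lone surrogate.
text = raw.decode("utf-8", errors="surrogateescape")
round_trip = text.encode("utf-8", errors="surrogateescape")

# The resulting str is not valid Unicode: a strict encode rejects it.
try:
    text.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False
```

The bytes survive the round trip, but every encode and decode along the way has to remember the error handler, and anything that does a strict encode will blow up on the smuggled surrogate.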

o11c@programming.dev · 1 year ago

The problem with mailing lists is that no mailing list provider ever supports “subscribe to this message tree”.

As a result, either you get constant spam, or you don’t get half the replies.

o11c@programming.dev · edit-2 1 year ago

Unfortunately both of those are used in common English or computer words. The only letter pairs not used are: bq, bx, cf, cj, dx, fq, fx, fz, hx, jb, jc, jf, jg, jq, jv, jx, jz, kq, kz, mx, px, qc, qd, qg, qh, qj, qk, ql, qm, qn, qp, qq, qr, qt, qv, qx, qy, qz, sx, tx, vb, vc, vf, vj, vm, vq, vw, vx, wq, wx, xj, zx.

Personally I have mappings based on <CR>, and press it twice to get a real newline.

o11c@programming.dev · 1 year ago

The problem is that there’s a severe hole in the ABCs: there is no distinction between “container whose elements are mutable” and “container whose elements and size are mutable”.

(related, there’s no distinction for supporting slice operations or not, e.g. deque)
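
A short sketch of both holes using collections.abc. FixedBuffer is a hypothetical class invented for this example: its elements can be replaced but its size cannot change.

```python
from collections import deque
from collections.abc import MutableSequence

# The abstract methods MutableSequence demands include resizing operations.
required = set(MutableSequence.__abstractmethods__)

class FixedBuffer:
    """Hypothetical fixed-size container: elements are mutable, size is not."""
    def __init__(self, n):
        self._items = [0] * n
    def __len__(self):
        return len(self._items)
    def __getitem__(self, i):
        return self._items[i]
    def __setitem__(self, i, value):
        if not isinstance(i, int):
            raise TypeError("integer indices only")  # slice assignment could resize
        self._items[i] = value

buf = FixedBuffer(3)
buf[0] = 42
fits_abc = isinstance(buf, MutableSequence)  # no ABC describes this shape

# deque is registered as a MutableSequence, yet it rejects slices.
deque_is_mutable_sequence = isinstance(deque(), MutableSequence)
try:
    deque([1, 2, 3])[0:2]
    deque_slices = True
except TypeError:
    deque_slices = False
```

To claim MutableSequence, FixedBuffer would have to provide __delitem__ and insert, which is precisely what it cannot do honestly.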

o11c@programming.dev · 1 year ago

Sure, there are libraries and tools for some parts. But the question is: do they actually add value, or do they subtract it due to the cost of integrating them? There are only a couple parts where it is likely worth it (but even there, you’ll need to understand what they’re doing): the UDP reliability layer, the encryption layer, and possibly the event loop (but I say that mostly because io_uring is weird; previously I wouldn’t include this. On the client side it isn’t needed, but that depends on how much code you share between server and client). Packet framing and serialization are really easy to do yourself and most existing tools (which usually do generate code anyway) have weird limitations or overhead.
“should do more” is a general rule to prevent easy DoS attacks. It can mean packet sizes, it can mean computation, it can mean hard-coded timer delays, it can mean all sorts of things. Don’t make it easy for a malicious client to waste shared resources. This is mostly relevant for early in the connection … but keep in mind that it’s possible for a malicious client to spy on a legitimate client and try to take over - that is, connections aren’t actually a real thing (this applies even for TCP).
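
To give an idea of how little code do-it-yourself framing takes, here is a minimal sketch in Python. The header layout (16-bit packet type, 16-bit payload length, network byte order) and the function names are assumptions made up for this example, not a recommendation for any particular wire format.

```python
import struct

# Hypothetical layout: packet type, payload length (both unsigned 16-bit, big-endian).
HEADER = struct.Struct("!HH")

def encode_packet(packet_type, payload):
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large")
    return HEADER.pack(packet_type, len(payload)) + payload

def decode_packets(buffer):
    """Split a byte buffer into complete (type, payload) packets plus leftover bytes."""
    packets = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        packet_type, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # incomplete packet; wait for more data
        packets.append((packet_type, buffer[offset + HEADER.size:end]))
        offset = end
    return packets, buffer[offset:]
```

Every packet carries its own length here, so the core loop never needs to know which packet types are logically fixed-size.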

o11c@programming.dev · 1 year ago

You’ve clearly thought about the problem, so the solutions should be relatively obvious. Some less obvious ones:

- It is impossible to make TCP reliable no matter how hard you try, because anybody can inject an RST flag at any time and cut off your connections (this isn’t theoretical, it’s actually quite common for long-lived gaming connections). That leaves UDP, for which there are several reliability layers, but most of them are not battle-tested - remember, TCP is most notable for congestion-control! HTTP/3 is probably the only viable choice at scale, but beware that many implementations are very bad (e.g. not even supporting recvmmsg/sendmmsg which are critical for performance unlike with TCP; note the extra m)
- If you don’t encrypt all your packets, you will have random middleware mess with their data. Think at least a little about key rotation.
- To avoid application-centric DoS, make sure the client always does “more” than the server; this extends to e.g. packet sizes.
- Prefer to ultimately define things in data, not code (e.g. network packet layouts). Don’t be afraid to write several bespoke code-generators; many real-world serialization formats in particular have unacceptable tradeoffs. Make sure the core code doesn’t care about the details (e.g. make every packet physically variable-length even if logically it is always fixed-length; you can also normalize zero-padding at this level for future compatibility. I advise against delta-compression at this level because that’s extra processing you don’t need).
- Make sure the client only has to connect to a single server. If you have multiple servers internally, have a thin bouncer/proxy that forwards packets appropriately. This also has benefits for the inevitable DDoS attacks.
- Latency is a bitch and has far-ranging effects, though this is highly dependent on not just genre but also UI. For example “hold down a key to move continuously through the world” is problematic whereas “click to move to a location” is not.
- Beware quadratic complexity, e.g. if every player must send a location update to every player.
- Think not only about the database, but how to back up the database and how to roll back in case of catastrophe or exploit. An append-only flat file has a lot going for it; only periodic repacking is needed and you can keep the old version for a while with a guarantee that it’ll replay to identical state to the initial version of the new file. Of course, the best state is no state at all. You will need to consider the notion of “transaction” at many levels, including scripting (you must give me 20 bear asses for me to give), trading between players, etc.
- You will have abuse in chat. You will also have cybersex. It’s possible to deal with this in a privacy-preserving way by merely signing chat, not logging it, so the player can present evidence only if they wish, but there are a lot of concerns about e.g. replays, selective message subsets, etc.
- There will be bots, especially if the official client isn’t good enough.
- It’s $CURRENTYEAR; write code for IPv6 exclusively. There are sockopts for transparently handling legacy IPv4 clients.
- Client IP address is private information. It is also the only way to deal with certain kinds of abuse. Sometimes, you just have to block all of Poland.
- Note that routing in parts of the world is really bad. Sometimes setting up your own dedicated connection chain between datacenters can improve performance by orders of magnitude, rather than letting clients use whatever their ISP says. If nesting proxies, be sure to correctly validate IPs.
- Life is simpler if internal stuff listens on a different port from external stuff, but still verify your peer. IP whitelisting is useless except for localhost (which, mind, is all of 127.0.0.0/8 for IPv4 - about the only time IPv4 is actually useful rather than a mere mirage).
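
A sketch of the IPv6-only approach and the localhost check in Python, assuming a platform that exposes the IPV6_V6ONLY socket option. The function names are made up for this example. On a dual-stack socket, legacy IPv4 peers show up as IPv4-mapped addresses (::ffff:a.b.c.d), so those get normalized before any check.

```python
import ipaddress
import socket

def open_dual_stack_udp(port):
    # One IPv6 socket that also accepts IPv4 clients (assumes OS dual-stack support).
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.bind(("::", port))
    return sock

def normalize_peer(addr):
    # Unwrap IPv4-mapped IPv6 addresses so checks see the real IPv4 address.
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return ip.ipv4_mapped
    return ip

def is_localhost(addr):
    # Covers ::1 and all of 127.0.0.0/8, including the mapped form.
    return normalize_peer(addr).is_loopback
```

This only decides whether a peer address is loopback; it is no substitute for actually verifying the peer.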

o11c@programming.dev · 1 year ago

True, speed does matter somewhat. But even if xterm isn’t the ultimate in speed, it’s pretty good. Starts up instantly (the benefit of no extraneous libraries); the worst question is if it’s occasionally limited to the framerate for certain output patterns, and if there’s a clog you can always minimize it for a moment.

o11c@programming.dev · 1 year ago

Speed is far from the only thing that matters in terminal emulators though. Correctness is critical.

The only terminals in which I have any confidence of correctness are xterm and pangoterm. And I suppose technically the BEL-for-ST extension is incorrect even there, but we have to live with that and a workaround is available.

A lot of terminal emulators end up hard-coding a handful of common sequences, and fail to correctly ignore sequences they don’t implement. And worse, many go on to implement sequences that cannot be correctly handled.

One simple example that usually fails: \e!!F. More nasty, however, are the ones that ignore intermediaries and execute some unrelated command instead.

I can’t be bothered to pick apart specific terminals anymore. Most don’t even know what an IR is.

o11c@programming.dev · 1 year ago

I guess I forgot to mention the other implicit difference in concerns:

When you are a game, you can reasonably assume: I have the user’s full focus and can take all the computing resources of their device, barring a few background apps.

When you are an application, the user will almost always have several other applications running to a meaningful degree, and those eat into available resources (often in a difficult-to-measure way). Unfortunately this rarely gets tested.

I’m not saying you can’t write an app using a game toolkit or vice versa, but you have to be aware of the differences and figure out how to configure it correctly for your use case.

(though actually - some purely-turn-based games that do nothing until user enters input do just fine on app toolkits. But the existence of such games means that game toolkits almost always support some way of supporting the app paradigm. By contrast, app toolkits often lack ready support for continuous game paradigms … unless you use APIs designed for video playback, often involving creating a separate child “window”. Actual video playback is really hard; even the makers of dedicated video-playing programs mess it up.)

o11c@programming.dev · 1 year ago

The problem with XCB is that it’s designed to be efficient, not easy. If you’re avoiding toolkits for some reason, “so what if I block the world” may be a reasonable tradeoff.