• 1.55K Posts
  • 97 Comments
Joined 1 year ago
cake
Cake day: June 9th, 2023

help-circle




















  • Three side remarks about China, which can be a peculiar example to compare to for Russia, maybe even any other country:

    • They actually banned consoles for a quite significant 15 years (2000–2015), which strongly tilted their market towards PC.
    • Their companies actively make PC-type gaming handhelds, and many of them are even well-established in the business ahead the current “Steam Deck” wave/bandwagon: GPD (once called GamePad Digital, first release in 2016), OneXPlayer (2020), Ayaneo (2021).
    • Chinese gaming companies are quite at the whim of the censorship, and occasional “crackdowns” out of the blue, and many have therefore reoriented themselves for an international audience to de-risk their business.





  • How does this analogy work at all? LoRA is chosen by the modifier to be low ranked to accommodate some desktop/workstation memory constraint, not because the other weights are “very hard” to modify if you happens to have the necessary compute and I/O. The development in LoRA is also largely directed by storage reduction (hence not too many layers modified) and preservation of the generalizability (since training generalizable models is hard). The Kronecker product versions, in particular, has been first developed in the context of federated learning, and not for desktop/workstation fine-tuning (also LoRA is fully capable of modifying all weights, it is rather a technique to do it in a correlated fashion to reduce the size of the gradient update). And much development of LoRA happened in the context of otherwise fully open datasets (e.g. LAION), that are just not manageable in desktop/workstation settings.

    This narrow perspective of “source” is taking away the actual usefulness of compute/training here. Datasets from e.g. LAION to Common Crawl have been available for some time, along with training code (sometimes independently reproduced) for the Imagen diffusion model or GPT. It is only when e.g. GPT-J came along that somebody invested into the compute (including how to scale it to their specific cluster) that the result became useful.


  • This is a very shallow analogy. Fine-tuning is rather the standard technical approach to reduce compute, even if you have access to the code and all training data. Hence there has always been a rich and established ecosystem for fine-tuning, regardless of “source.” Patching closed-source binaries is not the standard approach, since compilation is far less computational intensive than today’s large scale training.

    Java byte codes are a far fetched example. JVM does assume a specific architecture that is particular to the CPU-dominant world when it was developed, and Java byte codes cannot be trivially executed (efficiently) on a GPU or FPGA, for instance.

    And by the way, the issue of weight portability is far more relevant than the forced comparison to (simple) code can accomplish. Usually today’s large scale training code is very unique to a particular cluster (or TPU, WSE), as opposed to the resulting weight. Even if you got hold of somebody’s training code, you often have to reinvent the wheel to scale it to your own particular compute hardware, interconnect, I/O pipeline, etc… This is not commodity open source on your home PC or workstation.


  • The situation is somewhat different and nuanced. With weights there are tools for fine-tuning, LoRA/LoHa, PEFT, etc., which presents a different situation as with binaries for programs. You can see that despite e.g. LLaMA being “compiled”, others can significantly use it to make models that surpass the previous iteration (see e.g. recently WizardLM 2 in relation to LLaMA 2). Weights are also to a much larger degree architecturally independent than binaries (you can usually cross train/inference on GPU, Google TPU, Cerebras WSE, etc. with the same weights).


  • Unless Valve can either find or pay a company that does a custom packaging of a Nvidia GPU with x86 (like the Intel Kaby Lake-G SoC with an in-package Radeon), very unlikely. The handheld size makes an “out of package” discrete GPU very difficult.

    And making Nvidia themselves warm up to x86 is just unrealistic at this point. Even if e.g. Nintendo demanded, the entire gaming market — see AMD’s anemic recent 2024Q1 result from gaming vs. data center and AI — is unlikely to be compelling enough for Nvidia to be interested in x86 development, vs. continuing with their ARM-based Grace “superchip.”







  • He was criticized also because the girls were not in danger of becoming infected. See e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6724388/ :

    The Chinese episode has also generated other issues. Several notes demonstrate that this was an experiment and not a therapeutic intervention (even He Jiankui called it a ‘clinical trial’). The babies were not at risk of being born with HIV, given that sperm washing had been used so that only non-infected genetic material was used. Further, even though one of the parents (or both) was infected, it did not mean the children were more prone to becoming infected. The risk of becoming infected by the parents’ virus was very low (Cowgill et al., 2008). In sum, there was no curative purpose, nor even the intention to prevent a pressing risk. Finally, the interventions were different for each twin. In one case, the two copies of CCR5 were modified, whereas in the other only one copy was modified. This meant that one twin could still become infected, although the evolution of the disease would probably be slower. The purpose of the scientific team was apparently to monitor the evolution of both babies and the differences in how they reacted to their different genetic modifications. This note also raised the issue of parents’ informed consent regarding human experimentation, which follows a much stricter regimen than consent for therapeutic procedures.

    Other critical articles (e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8524470/) have also cited in particular https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4779710/, which states in the result section:

    No HIV transmission occurred in 11,585 cycles of assisted reproduction using washed semen among 3,994 women (95% confidence interval [CI] = 0–0.0001). Among the subset of HIV-infected men without plasma viral suppression at the time of semen washing, no HIV seroconversions occurred among 1,023 women following 2,863 cycles of assisted reproduction using washed semen (95%CI= 0–0.0006). Studies that measured HIV transmission to infants reported no cases of vertical transmission (0/1,026, 95% CI= 0–0.0029). Overall, 56.3% (2,357/4,184, 95%CI=54.8%–57.8%) of couples achieved a clinical pregnancy using washed semen.





  • From my own statistics how many I feel worthy posting/linking on Lemmy, the most direct alternative to Kotaku is Eurogamer. PCGamer, PCGamesN and Rock Paper Shotgun are occasionally OK, but you have to cut through a lot of spam and clickbait (i.e. exactly this “50 guides per week” type of corporate guidance). Not sure if this is also the state that Kotaku will end up in. The Verge sometimes also have good articles, but the flood of gadget consumerism articles there is obnoxious.


  • In other words, there may be downsides just to placing CS within an engineering school, let alone making it an independent college. Left entirely to themselves, computer scientists can forget that computers are supposed to be tools that help people. Georgia Tech’s College of Computing worked “because the culture was always outward-looking. We sought to use computing to solve others’ problems,” Guzdial said. But that may have been a momentary success. Now, at Michigan, he is trying to rebuild computing education from scratch, for students in fields such as French and sociology. He wants them to understand it as a means of self-expression or achieving justice—and not just a way of making software, or money.

    Early in my undergraduate career, I decided to abandon CS as a major. Even as an undergraduate, I already had a side job in what would become the internet industry, and computer science, as an academic field, felt theoretical and unnecessary. Reasoning that I could easily get a job as a computer professional no matter what it said on my degree, I decided to study other things while I had the chance.

    I have a strong memory of processing the paperwork to drop my computer-science major in college, in favor of philosophy. I walked down a quiet, blue-tiled hallway of the engineering building. All the faculty doors were closed, although the click-click of mechanical keyboards could be heard behind many of them. I knocked on my adviser’s door; she opened it, silently signed my paperwork without inviting me in, and closed the door again. The keyboard tapping resumed. The whole experience was a product of its time, when computer science was a field composed of oddball characters, working by themselves, and largely disconnected from what was happening in the world at large. Almost 30 years later, their projects have turned into the infrastructure of our daily lives. Want to find a job? That’s LinkedIn. Keep in touch? Gmail, or Instagram. Get news? A website like this one, we hope, but perhaps TikTok. My university uses a software service sold by a tech company to run its courses. Some things have been made easier with computing. Others have been changed to serve another end, like scaling up an online business.

    The struggle to figure out the best organizational structure for computing education is, in a way, a microcosm of the struggle under way in the computing sector at large. For decades, computers were tools used to accomplish tasks better and more efficiently. Then computing became the way we work and live. It became our culture, and we began doing what computers made possible, rather than using computers to solve problems defined outside their purview. Tech moguls became famous, wealthy, and powerful. So did CS academics (relatively speaking). The success of the latter—in terms of rising student enrollments, research output, and fundraising dollars—both sustains and justifies their growing influence on campus.

    If computing colleges have erred, it may be in failing to exert their power with even greater zeal. For all their talk of growth and expansion within academia, the computing deans’ ambitions seem remarkably modest. Martial Hebert, the dean of Carnegie Mellon’s computing school, almost sounded like he was talking about the liberal arts when he told me that CS is “a rich tapestry of disciplines” that “goes far beyond computers and coding.” But the seven departments in his school correspond to the traditional, core aspects of computing plus computational biology. They do not include history, for example, or finance. Bala and Isbell talked about incorporating law, policy, and psychology into their programs of study, but only in the form of hiring individual professors into more traditional CS divisions. None of the deans I spoke with aspires to launch, say, a department of art within their college of computing, or one of politics, sociology, or film. Their vision does not reflect the idea that computing can or should be a superordinate realm of scholarship, on the order of the arts or engineering. Rather, they are proceeding as though it were a technical school for producing a certain variety of very well-paid professionals. A computing college deserving of the name wouldn’t just provide deeper coursework in CS and its closely adjacent fields; it would expand and reinvent other, seemingly remote disciplines for the age of computation.

    Near the end of our conversation, Isbell mentioned the engineering fallacy, which he summarized like this: Someone asks you to solve a problem, and you solve it without asking if it’s a problem worth solving. I used to think computing education might be stuck in a nesting-doll version of the engineer’s fallacy, in which CS departments have been asked to train more software engineers without considering whether more software engineers are really what the world needs. Now I worry that they have a bigger problem to address: how to make computer people care about everything else as much as they care about computers.

    Ian Bogost is a contributing writer at The Atlantic.