this post was submitted on 20 Sep 2023


It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (language) models. Since these large language models exhibit impressive predictive capabilities, they are well-positioned to be strong compressors. In this work, we advocate for viewing the prediction problem through the lens of compression and evaluate the compression capabilities of large (foundation) models. We show that large language models are powerful general-purpose predictors and that the compression viewpoint provides novel insights into scaling laws, tokenization, and in-context learning. For example, Chinchilla 70B, while trained primarily on text, compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%), respectively. Finally, we show that the prediction-compression equivalence allows us to use any compressor (like gzip) to build a conditional generative model.
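For anyone who wants a feel for the prediction-compression link the abstract describes, here is a rough Python sketch. It is not the paper's code. `code_length_bits` and `next_byte_by_compression` are hypothetical names made up for illustration. The first function sums -log2(p) under a toy adaptive byte model, which is the size an arithmetic coder driven by that model would approach. The second runs the idea in reverse: it uses `zlib` (the same DEFLATE algorithm gzip uses) to score candidate next bytes and greedily keeps the one that compresses best. The greedy choice is a simplifying assumption, not necessarily how the authors generate samples.

```python
import math
import zlib
from collections import Counter


def code_length_bits(data: bytes) -> float:
    """Ideal code length of `data`, in bits, under a toy adaptive model.

    The model is an order-0 byte counter with add-one smoothing over the
    256 possible byte values. Each byte costs -log2(p) bits, where p is the
    probability the model assigned to it before seeing it.
    """
    counts = Counter()
    total_bits = 0.0
    for seen, byte in enumerate(data):
        # Probability of this byte given the bytes seen so far.
        p = (counts[byte] + 1) / (seen + 256)
        total_bits += -math.log2(p)
        counts[byte] += 1
    return total_bits


def next_byte_by_compression(context: bytes, candidates: bytes) -> int:
    """Return the candidate byte that keeps the compressed size smallest.

    Shorter compressed output is treated as higher probability, so the
    compressor acts as a crude conditional predictor. Ties go to the
    candidate listed first.
    """
    def cost(b: int) -> int:
        return len(zlib.compress(context + bytes([b]), 9))

    return min(candidates, key=cost)


if __name__ == "__main__":
    sample = b"abababababababababab"
    print("raw bits:", 8 * len(sample))
    print("model bits:", round(code_length_bits(sample), 1))
    print("guessed next byte:", bytes([next_byte_by_compression(sample, b"abxyz")]))
```

A better predictor assigns higher p to what actually comes next, so the total bit count drops. That is the sense in which a strong language model doubles as a strong compressor.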

I wonder what a paper like this, especially given the title, does for the legal case regarding copyright and generative AI. Haven't had a chance to read the paper yet, so don't know if the findings are relevant to copyright.

KingRandomGuy 2 points 11 months ago

And there are relatively few people who design the image sensors for cameras compared to the number of people using a camera to take pictures. They're still designed as a tool by a person.

I'm not the most familiar with copyright law, but IIRC you're certainly able to violate copyright while taking a photo. If you take a photo of a copyrighted work (e.g. parts of a book) without artistic intent, I don't believe that's considered transformative.

I suspect the courts will end up having to deal with many of these issues on a case-by-case basis, just like they already do with fair use.