
AI stands for Angry Iguana [Image Source]

I am afraid to share anything of value on the internet nowadays, not that I ever had anything of value to share to begin with.

Seems like anything of value published without proper protection, licensing, and money to go to court if needed is just sucked up by a mass of web scrapers, bots, and patent trolls. (If you need a primer on patent trolls in tech: Video)

Not sure why they want my crap Java code, but they got it. I hope some students got prompted with one of my solutions to a simple interview question with an O(n) that would make even a React Native coder cry.

Then to put the cherry on top, some jackass copy-pastes it out of a f**ing AI client prompt and says it’s his and wins a contest.*

And the funny thing is, I don’t blame them! They have to compete with other people who are doing the same thing. People will gravitate to the least effort for the most output, and people desperately want to be successful and viewed as competent by other people.

It’s basically a law of human nature, and it’s easily exploitable.

I’m not saying this is entirely a bad thing either; access to information is one of the main pros of the internet and modern society.

There is a subtle difference with AI: it has a lot of value for a minority of people who want you to be dependent on their product and their data, and they don’t want to disclose where that data came from or who it belongs to.

They took the stock market, housing market, energy market, and my grocery bill, so now they want to take PornHub away as well? (RIP Texas) Not so fast, though!

These same arguments have been made about search engines like Google in the past, saying Google is a monopoly on access to said information. In the case of Google, however, you have to click on a link that takes you to a website where the person who made that thing (at least in theory) could get credit for what they did, even though they often don’t.

Or if you have money, you go through a publishing company to ensure your name is stamped on the thing or patented before you release it into the wild.

Another counterargument is that AI could output the sources for said data at every prompt. In reality, I don’t see that being the default output in ChatGPT, for example, for every question that warrants it, unless you explicitly ask for it.

And even then, who controls what sources the LLM puts out? Do the owners of the AI dictate who gets the credit? Does the LLM favor a specific entity, only showing you its content so that it’s the only source it can cite? How would you even know it’s not flat-out lying to you?

This is not foolproof at all; you could even argue that plagiarism makes the whole argument invalid.

By that logic, why bother making citations and listing sources when writing a paper or article? Why use the Wayback Machine to validate what was actually said and by whom, even if the website goes down? Excluding acts of kindness or donation, why patent or publish something if there is no way to actually claim you did it when someone uses data derived from it?

Plenty of people don’t do these things or don’t care, sadly, and it’s because they want you to believe it’s theirs and it benefits them. Even if they don’t realize this consciously.

Another point is that if you are using ChatGPT to try and give other people the illusion you created something for street cred or for profit, and you are aware of it, why would you want to know the sources or see the patent? That would mean your history of interacting with the LLM could hold you accountable for plagiarism or patent infringement.

Which honestly would be amazing. Please do that. With blockchain? Granted, that would break anonymity. IDK.

Why would a kid in school using an LLM to do their homework want to see sources, except when absolutely necessary for an essay that requires them for a grade? It would literally be undeniable proof it’s plagiarized.

Why would an entrepreneur want you to know that their idea might be a patent infringement if no one could prove it in the first place? They wouldn’t be able to exploit it for those sweet gains.

There is at least a little bit of control over sources on the internet. In contrast, AI does not have to disclose anything trackable (intentionally, at least), and it’s in its makers’ best interest not to.

That may change. I have no clue.

Repeat after me: the people selling you a subscription to use their data for $100+ a month don’t want the credit to go anywhere else. Investors don’t want competition.

I’m an investor; stop trying to compete with Nvidia. I want to take your money hand over fist and give you no way to resist it.

The Real Question

I’m not against AI. It’s here, like it or not.

Rather, my argument is this: how is it possible to take so much licensed data from so many people and companies, then legally sell it back as a “futuristic chatbot that will take your job with your own data” to those with deep pockets to invest in it?

If it is actually going to become artificial general intelligence (AGI), it’s coming for your ass, too. It’s not just LLMs. There are other applications for AI that are just as scary but are not talked about in the news.

I’m shocked that people are okay with that narrative.

How are average people who don’t own vast amounts of Nvidia, Google, and Microsoft stock on board with this idea?

I don’t get it, not to mention all the ethical concerns with validating its outputs.

If I had to guess, it is because we are past the point of no return. The only thing that would stop AI development now is the loss of investor money because Nvidia stock enjoyers no longer believe AI has some way to turn a profit with other people’s data or sell GPUs to people who think they can turn a profit.

They have a massive moat, or at least they did. People who understand what is going on should understand the real problem and have a very obvious tangible advantage in exploiting it.

Cave Scenario Analysis

To make it as simple as possible, here are some scenarios in caveman speak.

Take Cave Steve, for example, who was hired as the new financial quant. He is a lot cheaper than our last guy, who got poached by Goldman Sachs. He put together a cave painting presentation for our new investing strategy about AI. Here is the executive summary.

“First scenario paint I do for new job”

  1. Student go to college or trade school
  2. Student do homework, try hard
  3. Student good grade
  4. Student gets job NOT dependent on AI do job
  5. Cave Corp AI less likely make money, low demand, no need.
  6. AI no take good job

Result: “This no make us money”

“On other cave stone in hand. What Cave Corp Want”

  1. Student go to college or trade school
  2. AI do homework, student play WOW on cave stone box
  3. Student good grade
  4. Get job IS dependent on AI do job
  5. Cave Corp AI more likely make money, high demand, and almost need for no skill student do job.
  6. AI may eventually take job, or just less no skill human need for do job good enough for Cave Corp AI investor, layoff. Self-fulfilling prophecy as no skill student gives self to AI.

Result: “Make us most money”

“Last cave paint, sad, but light end of cave”

  1. Human go college or trade school or self-study, no matter
  2. Human do homework, practice project, learn and try hard at everything
  3. Human good grade, learn new skill on own, believe in self
  4. Get job NOT dependent on AI do job
  5. But AI become too smart, no cave job, unlikely but maybe
  6. But Human still have big brain for solve problem
  7. Work with AI, make new job, help other caves do same
  8. Fight AI from cave, like sci-fi cave paints, would be cool.

Result: “Break even”

Data Aggregation

Even though an AI model is not a database, AI can be thought of as an aggregation of vast amounts of data. In modern times, data is a resource with a price. You can derive actionable information from it, which directly translates to potential profit.

If this was not the case, why learn new skills and information that others don’t have? Why go to college or trade school to learn anything to get a better job for money? Why would companies drop billions into R&D with the intention of turning a profit?

AI is the embodiment of attempting to derive information from data and from information produced by other people. It still sucks at the ‘deriving’ bit, albeit that might change. That is how it’s trained: on those resources. Then, running the model’s algorithms, it regurgitates by guessing the expected tokens based on what it was trained on and your prompt.
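
To make the ‘guessing the expected tokens’ part concrete, here’s a deliberately dumb sketch: a toy bigram model that ‘trains’ by counting which word follows which in a tiny made-up corpus, then ‘generates’ by parroting the most likely next word. A real LLM does this with a neural network over billions of parameters instead of a lookup table, but the regurgitation idea is the same.

```python
from collections import Counter, defaultdict

# Toy "training" data; a real model ingests terabytes of other people's text.
corpus = "the cave man paints the cave wall and the cave man sleeps".split()

# Count which word tends to follow which (a crude stand-in for learned weights).
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def generate(prompt_word, length=5):
    """'Regurgitate' by repeatedly guessing the most likely next word."""
    output = [prompt_word]
    for _ in range(length):
        counts = next_word_counts.get(output[-1])
        if not counts:
            break  # the model never saw this word during "training"
        output.append(counts.most_common(1)[0][0])
    return " ".join(output)

print(generate("the"))  # -> "the cave man paints the cave"
```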

For example, financial institutions have been doing this for years.

They collect large amounts of data to find sequences of repeatable processes so they can buy resources at a discount or sell at a premium. This gives them disproportionate control of the resource, which translates to profit.
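
As a purely illustrative sketch (the prices and the 3% threshold are made up), this is the flavor of ‘find the repeatable pattern in the data’ being described: average the recent prices and flag anything trading well below that average as a ‘discount’.

```python
# Hypothetical price history; real quants use far more data and far fancier models.
prices = [101.0, 99.5, 102.3, 98.7, 97.9, 103.1, 96.4, 100.2]

window = 5          # look back over the last 5 observations
discount = 0.97     # flag anything more than 3% below the recent average

for day in range(window, len(prices)):
    recent_avg = sum(prices[day - window:day]) / window
    if prices[day] < recent_avg * discount:
        print(f"Day {day}: {prices[day]} looks like a 'discount' vs recent average {recent_avg:.2f}")
```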

If this wasn’t the case, there would be no need for nondisclosure agreements (NDAs) to keep others from knowing what you know or what you are doing. There would be no need for things like Data Science in the private sector at all. Why would you collect data that others may not have and analyze it, searching for profit or enhancing an existing product or service for more profit if this was not a ‘thing’?

Final Thoughts

This might sound like I am making an argument against free markets, but I am not.

Closed-source AI with no competitors (competing is expensive), in the control of a single entity or a handful of them, would be a way to hoard resources for profit.

I was relieved when the latest DeepSeek model was released open source (kind of), and at least for the moment, I have a little confidence that the average person has some leverage.

I don’t care about DeepSeek itself; what I really care about is that it represents more competitors having the ability to enter the AI market space with less funding.

The development of other models by ordinary people and smaller businesses, not FAANG or billionaire-driven funding, might be mission-critical in the future.

If you have to go to a single place to get all of your information, you are dependent on that source for that information. And to reiterate, information is a resource with a price.

Try to run this stuff locally, do your own research, and understand how it actually works. Don’t just use ChatGPT instead of learning new things. Trust in yourself, not a robot: the human brain is amazing because it can improve and learn with effort, not with a data center and billions of dollars.
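
If you want to try the ‘run this stuff locally’ part, here’s a minimal sketch using the open-source Hugging Face transformers library. The model name is just an example; a small open model like this runs fine on an ordinary laptop, and you can swap in whatever open model you prefer.

```python
# Minimal local text generation with the Hugging Face transformers library.
# One-time setup: pip install transformers torch
from transformers import pipeline

# "distilgpt2" is just a small example model; it downloads once, then runs locally.
generator = pipeline("text-generation", model="distilgpt2")

result = generator("The cave man looked at the GPU and said", max_new_tokens=30)
print(result[0]["generated_text"])
```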

Use AI to aid you, not as a crutch or an excuse to tear yourself down.