There will always be a subset of users whose goal is to not use your service, but to arbitrage your service into the maximum value for themselves.
For example -- let's say you offer $100 in free AWS credits for signing up to your platform. Expect a malicious user to eventually come to your platform, realize they can resell those $100 in credits for $50, and start using your platform for their own gain. Unless the cost imposed by the mechanisms you put in place to reduce fraud / second sign ups / etc. is greater than the value they are receiving ($50), they will continue.
With sites where the platform is free, the math almost always makes sense for these malicious users to eventually abuse. In this case it was leveraging the email reputation of another domain at no cost to their own (along with the added value of anyone getting phished), but on other sites it's public profiles being used for backlinks / spam, etc.
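A minimal sketch of that break-even reasoning, assuming a simple per-account model. The function names and the friction costs are made up for illustration; only the $100 credit and the $50 resale price come from the example above.

```python
def abuse_profit_per_account(resale_value: float, cost_per_account: float) -> float:
    """Net gain for an abuser from one fraudulent signup.

    resale_value: what the abuser can actually sell the reward for.
    cost_per_account: hypothetical total cost (time, phone numbers,
        captcha solving, etc.) that the platform's friction imposes.
    """
    return resale_value - cost_per_account


def abuse_continues(resale_value: float, cost_per_account: float) -> bool:
    # Abuse stays rational as long as each extra account nets a profit.
    return abuse_profit_per_account(resale_value, cost_per_account) > 0


# $100 in credits resold for $50, with assumed friction costs.
print(abuse_continues(50.0, 2.0))   # cheap to sign up: still worth abusing
print(abuse_continues(50.0, 60.0))  # friction exceeds the payout
```

On a free platform the friction cost is often close to zero, which is why the comparison so often comes out in the abuser's favor.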
You're mixing up bonus abuse here. The people behind phishing are more like hackers, whereas bonus abuse is usually run by non-technical people or bot farms. Scammers are much more dangerous, because they're typically behind operations far wider than just phishing; this might include actual financial fraud, international money laundering, and so on.
Bonus abuse is a small shop, whereas phishing through third-party services is much more likely to be an organized crime group.
To the end platform, what's the difference? Mitigation techniques largely remain the same, in that you make the abuse cost more time / energy / money than its end result is worth. The platform cares about stopping the abuse -- not necessarily correctly identifying whether the people abusing their platform are small shop "bot farms" vs organized crime.
To the platform, the difference shows up exactly in the mitigation math. The 'make it cost more than it's worth' model only works when both sides of that ratio are knowable and bounded. With bonus abuse, the reward is fixed and the math is clear, so you can reliably price the abuser out.
With organized criminals, you can't actually see what the abuse is 'worth' to them. And they can escalate almost infinitely: mimicking real user behavior, routing through residential IP proxies, using email addresses with established reputation. At the top of the pyramid we've seen full mimics with real social network profiles and activity; they even answer phone calls.
That's why it's worth collecting events before acting: what the account is about, which IP network they use, whether they fake devices, whether there's any warmup prior to registration. Because that's what helps estimate whether your mitigation will actually work, and lets you respond in a balanced manner instead of under- or over-reacting.
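As a rough sketch of what "collecting events before acting" could look like in code: the signal names, weights, and thresholds below are all hypothetical, chosen only to mirror the questions listed above (IP network, faked devices, warmup before registration). A real system would need to tune these against its own traffic.

```python
from dataclasses import dataclass


@dataclass
class SignupEvents:
    # All fields are hypothetical signals, named after the questions above.
    ip_is_datacenter_or_proxy: bool           # which IP network they use
    device_looks_spoofed: bool                # whether they fake devices
    seconds_on_site_before_signup: float      # any warmup prior to registration
    signups_from_same_network_last_hour: int  # how busy that network has been


def risk_score(e: SignupEvents) -> int:
    """Sum of made-up weights; higher means more suspicious."""
    score = 0
    if e.ip_is_datacenter_or_proxy:
        score += 2
    if e.device_looks_spoofed:
        score += 3
    if e.seconds_on_site_before_signup < 5:
        score += 1  # no warmup: went straight to the form
    if e.signups_from_same_network_last_hour > 10:
        score += 3
    return score


def respond(score: int) -> str:
    # Graduated response instead of a single allow/block switch.
    if score >= 6:
        return "block"
    if score >= 3:
        return "require_verification"
    return "allow"


print(respond(risk_score(SignupEvents(True, False, 2.0, 0))))
```

The point of the graduated `respond` step is the "balanced manner" mentioned above: a borderline score triggers extra verification rather than an outright block.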
Please write your blog post yourself if you expect people to read it. The LLM output is very grating.
Thought the same. And also, this whole piece might just be a nicely obfuscated ad for the service in question.
Plus the "guardian sales angels" in the comment threads protecting their product:
"why, I thought it read really smoothly?"
"how sad of you to see only the bad things in life"
"need to get outside more?"
What has this all become. What have I become.
Why do you think this is LLM-generated? Reads perfectly fine to me.
The sentence construction, choice of vocabulary, and continually breathless tone are all clear indicators this was written by an llm and barely edited.
I threw part of it into pangram to get a second opinion:
https://www.pangram.com/history/8d6a7de3-86ac-4ce0-86c5-4f93...
Have you tried putting known human writing into pangram? I have. I've gotten 100% AI with multiple samples of my own human writing. It has also given me 50% on things I know were 100% AI written (from my prompts).
Pangram and everything like it is useless. The results are random on known samples.
Pangram specifically (as opposed to most other detectors) publishes internal audits, and seems to welcome external audits [0]. I'm not saying that you are necessarily wrong, just that in my opinion they have earned a higher bar of criticism than a random one-off anecdote.
[0] https://xcancel.com/JohnHolbein1/status/2059648132250570975#...
That's a fair criticism, I certainly didn't run a full benchmark. Just a few of my own pieces of writing. I also did it a few months ago, maybe it's gotten better since.
GP is right. This paragraph is a major tell. If you have read enough ChatGPT output that hasn’t been humanized, you start to instantly notice it:
“ The attacker hadn’t broken into anything. They’d just noticed something I hadn’t: I had a verified email-sending domain attached to open, unverified signup, and that’s a useful primitive if you don’t care what you send.”
Yes, instead he should have written: "The attacka didn broke into nuffin bruv. They jus noticing sometin I aint, yeh?"
Instant betterness.
I would rather read an email the sender actually wrote even if it looked like your example, as opposed to AI-written. In that sense it is "better" to me.
That's interesting! I have tried to get false positives from pangram and failed, so I trusted it a bit more than any of the others, although I generally just rely on my own intuition. I am curious what your false positive samples looked like, if you're willing to share.
(I'm less interested in false negatives; I have successfully produced those myself.)
I'll try to pull them up for you, I'd have to go back and find them on my computer.
To each their own, I suppose. I personally enjoyed the style, and it definitely pre-dates LLMs. For example:
> You want to know the benefits of free trade? Food is cheaper. Food is cheaper! Clothes are cheaper. Steel is cheaper. Cars are cheaper. Phone service is cheaper. You feel me building a rhythm here? That's because I'm a speech writer - I know how to make a point. It lowers prices, it raises income. You see what I did with 'lowers' and 'raises' there? It's called the science of listener attention. We did repetition, we did floating opposites, and now you end with the one that's not like the others. Ready? Free trade stops wars. Heh, and that's it.
It sounds like this: https://youtu.be/8dGkiJcEK78?si=MGfv2FM_GksGoMho
The style demonstrated in the blog post is really not the same.
> There was no exploit. No vulnerability disclosure. No CVE for me to write.
was a dead giveaway in my mind when I read it.
That was the sentence that made me close the tab.
If it's not LLM-generated, it's really hard to read. The writing style is "obnoxious Linkedin post".
"There was no exploit. No vulnerability disclosure. No CVE for me to write. The attacker filled out my signup form 942 times, made 942 workspaces, sent 942 batches of about a hundred invitations each, and stopped. They used my tool exactly as designed. The design was just bad enough that the tool was good for phishing."
> What stuck with me wasn’t the scale, although 14,000 people getting a phishing email from a domain I own is bad. It was how mundane it was.
> There was no exploit. No vulnerability disclosure. No CVE for me to write. The attacker filled out my signup form 942 times, made 942 workspaces, sent 942 batches of about a hundred invitations each, and stopped. They used my tool exactly as designed. The design was just bad enough that the tool was good for phishing.
The comments continue until the patterns are internalized https://news.ycombinator.com/item?id=48316049
Dots and periods. Everywhere. So many. There is no paragraph; it's sentences all the way down.
That made me wonder whether the project is entirely vibecoded as well.
Even for a project manager without network access, hosting flawed software on your LAN can only get you so far.
shrug it's getting really easy to spot the AI writing style. I find myself scanning for the obvious `tics' and almost always stop reading when I see them. It's too hard to focus on the content at that point for me.
Stop it. Someone writes grammatically correct prose with proper punctuation, and - not sorry - idiots like you label it AI. Where do you think LLMs learned to write?
Huh. I didn't assume it was LLM-generated. I liked the article. I appreciated that author cared about the 14K phish recipients as if they were proper users.
I will say, I've grown bored of folks complaining about AI generated content. But, to each their own. Good luck storming the castle.
"Disposable email domains blocked"
This one is really annoying, as in practice more and more services that become spammers or sell to what are basically spammers cannot be kept at arm's length.
Couple of things:
1. You are not alone, this happens at a large scale across the board with companies of all sizes.
2. More than likely the abuser did not do it manually; they almost certainly automated it.
3. A thoughtful business might roll out the full set of authentication features/gates once the business picks up; as a starting point, the safe option would have been to put signup behind any openly available OAuth provider.
Spammers are relentless. I had something similar happen to me 25 years ago. And once you're found out to allow any sort of information relay you end up in spamming scripts and for decades automated scripts will be trying to send/relay through you using the same api, even if you block it.
You learn to not leave anything open to spammers AT ALL, to your product's detriment because once you're labeled a spammer in this way your product is dead.
I guess the author learned a hard lesson about preventing abuse of a web service, especially a service that is capable of sending emails.
I have a few small projects that I would love to serve publicly from my VPS. But I have put them behind strict logins (no signup) or put them in read-only mode, with (likely premature) rate limiting, fail2ban and cloudflare, for fear that a month of bandwidth gets used within minutes by an attacker. For the same reason, sometimes I only shared the source on github and let people deploy it themselves if they are interested.
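For anyone wondering what the rate limiting part might look like at the application level, here is a minimal sliding-window sketch. It is in-memory and single-process only, and the class name, the key format, and the limit of 20 invitations per hour are assumptions for illustration, not anything taken from the article.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most `limit` actions per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of recent actions

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Hypothetical policy: 20 invitation emails per workspace per hour.
invites = SlidingWindowLimiter(limit=20, window_seconds=3600)
sent = sum(invites.allow("workspace-1", now=float(i)) for i in range(100))
print(sent)  # 20
```

Note that the attacker quoted elsewhere in this thread created 942 separate workspaces, so a per-workspace key alone would probably not have been enough. The same limiter could be keyed on IP network, or applied to signups and outbound email globally.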
I don't know what domain was used to send that crap but you should probably have an abuse contact listed at kaneo.app so that if people do discover issues from your service they have an easy way to get a hold of you.
I've been thinking of making an event platform like Partiful, but only for personal use because it's also the perfect platform for spam (send emails and texts to people with attacker-controlled content).
This kind of thing has happened to me.
I designed something that was "too open," and that "openness" was abused.
Sadly, spammers are why we can't have nice things; but that's been the case for decades. The incident I mentioned happened in the 1990s.
The good news is that once this happens to you, you learn your lesson.
Sadly, the Internet is not a high trust society.
Please write your own blog posts rather than asking us to read LLM slop.
Just curious, on what grounds do you call this slop?
I thought it was a perfectly cromulent article making a perfectly reasonable point.
https://news.ycombinator.com/item?id=48326376
If you have commits in the linux kernel, your open source code has certainly been used to murder people. Because it's in everything, including weapons systems.
I think the problem here is that they used their server (that is running a demo of the project).
Wish I’d read a different example here, don’t even wanna subconsciously discourage any open source heroes
Is this the new norm for trying to make software projects in the wild?
The 14000 sends over 3 hours (> 1/s) make it sound like more-than-human speed, i.e. automated.
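Checking the arithmetic on those figures (14,000 sends and 3 hours, both taken from the comment above), as a quick sanity check rather than a measurement:

```python
sends = 14_000
hours = 3

# Average send rate over the whole window.
rate_per_second = sends / (hours * 3600)
print(round(rate_per_second, 2))  # 1.3
```

That averages out to a bit more than one email per second, sustained for three hours.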
Wondering if LLM-assisted vulnerability hunting will lead to the same gains in scale for bad actors wanting to find spammable channels in applications. The barrier to entry becomes so much greater because any small project, once found, can be wrung dry of all its trust signals by third parties
Abuse such as this wasn't uncommon before; email platforms with lax rate limits have always been abused through their clients' unsecured infrastructure. The only difference in the post-LLM world is the number of platforms and clients popping up in this space with dubious code quality, which may lead to more attacks because:
a) having an email-sending product typically meant you had a project with a lot of effort and knowledge invested in it
b) the models, tokens spent and review done differ in the world of vibecoding, and there is a race to the bottom to produce, produce, produce. Quantity > quality
If you have a website somewhere with an unrestricted comment box, it gets spammed. That doesn't take a special AI, because for years there have been script kiddies scanning new domains, IP addresses on AWS, common wp-admin URLs, etc.
Captcha here will only harm your real users' experience and won't protect against this kind of abuse, since it comes from real scammers, not fully automated bots.
I've dealt with these and similar issues over the last 8 years, which led our team to develop a security tool 5 years ago that is now open-sourced.
https://github.com/tirrenotechnologies/tirreno
Perhaps the downvotes come from scammers.