A seller doesn't need to click a bad link to lose account access. All it takes is believing a message that looks like it came from the marketplace.
In Artovnia, buyer-to-seller communication runs through a built-in messaging module: buyers write to sellers via a contact form, sellers get an email notification and reply from within the admin panel. The form is available exclusively to registered and logged-in users — it's not an anonymous open channel. Even so, it turned out to be convenient enough for someone who had no intention of buying handmade goods. They wanted seller login credentials.
Response: stop it first, fix it second
No elegant solutions to start with. I commented out the line registering the module in medusa-config.ts:

    // { resolve: './src/modules/messaging' },

...and pushed it to production. The entire messaging module went offline before I'd fully understood the scale of the problem. In parallel, the attacker's account was banned and deleted the same day, so there was no way to retry from that particular account. The sequence was intentional: taking down the channel always beats live analysis when there's a real risk that the next seller gets the same message and this time believes it. The cost of slower buyer-to-seller communication was acceptable. The cost of a seller's login credentials being stolen was not.
Only after disabling the module did I start working on ensuring that the same class of attack would have no way back — and the module was only going to come back online together with those defenses, not sooner.
An email was also sent to all sellers via the internal newsletter module, warning them about the incident and providing instructions on how to proceed.
It's not just a link filter
The simplest approach to phishing is blocking messages with links from outside the domain. The problem is that an effective social engineering attack often needs no link at all. It's enough to impersonate support, build time pressure, and ask for login credentials or a "payment verification" directly in the message body.
That's why validation in Artovnia looks at two independent signals:
- links from outside artovnia.com and its subdomains,
- language impersonating Artovnia support or administration, combined with a request for credentials, urgency, or a fake payment scenario.
Either signal alone is enough to block the message before it reaches the database or the seller's inbox.
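The first signal can be sketched roughly as follows. This is a minimal illustration, not the production code: the function name extractExternalUrls, the URL pattern, and the fail-closed handling of unparseable links are all assumptions made for the example.

```typescript
// Hypothetical sketch: the real link extractor is not published.
const ALLOWED_ROOT = 'artovnia.com'

// Matches explicit http(s) URLs and bare "www." hosts.
const URL_PATTERN = /\b(?:https?:\/\/|www\.)[^\s<>"']+/gi

// True for artovnia.com itself and any of its subdomains.
function isAllowedHost(hostname: string): boolean {
  const host = hostname.toLowerCase().replace(/\.$/, '')
  return host === ALLOWED_ROOT || host.endsWith(`.${ALLOWED_ROOT}`)
}

// Returns every URL-like token whose host is outside the allowed domain.
function extractExternalUrls(content: string): string[] {
  const candidates = content.match(URL_PATTERN) ?? []
  return candidates.filter((raw) => {
    const withScheme = /^https?:\/\//i.test(raw) ? raw : `http://${raw}`
    try {
      return !isAllowedHost(new URL(withScheme).hostname)
    } catch {
      // Unparseable URL-like text is treated as external (fail closed).
      return true
    }
  })
}
```

Comparing the parsed hostname, rather than searching the raw text for "artovnia.com", is what keeps a lookalike such as artovnia.com.evil.example from passing. A pattern like this one does not catch bare domains written without a scheme or "www." prefix, so a real filter would likely need a broader matcher.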
Weighted scoring instead of a single trigger word
Blocking on a single word is fragile — it either lets through a cleverly worded attack or blocks an innocent customer message that happened to mention "bank" in a completely different context. Instead, every message receives a score composed of several signal categories: brand impersonation, time pressure, request for login credentials, financial data, attempts to move the conversation off-platform, fake institutions, and fake refund scenarios.
A simplified illustration of the mechanism — intentionally without the full list of phrases and weights, because that's one of those details not worth publishing verbatim:
    // Import path assumes Medusa v2.
    import { MedusaError } from '@medusajs/framework/utils'

    // matchedPatterns, extractExternalMessageUrls, MessageSender and RISK_THRESHOLD
    // are defined elsewhere in the module and intentionally omitted here.

    type SignalCategory =
      | 'impersonation'
      | 'urgency'
      | 'credentials'
      | 'financial_data'
      | 'off_platform_contact'
      | 'fake_institution'
      | 'fake_scenario'

    // Sums the weights of every pattern found in the combined subject and body.
    function scoreMessage(subject: string, content: string): number {
      const text = `${subject} ${content}`.toLowerCase()
      return matchedPatterns(text).reduce(
        (score, pattern) => score + pattern.weight,
        0
      )
    }

    function validateCustomerMessageSafety(
      senderType: MessageSender,
      subject: string,
      content: string
    ) {
      // Only customer-authored messages are checked.
      if (senderType !== MessageSender.USER) return

      const hasExternalLink = extractExternalMessageUrls(content).length > 0
      const riskScore = scoreMessage(subject, content)

      // Either signal alone is enough to reject the message.
      if (hasExternalLink || riskScore >= RISK_THRESHOLD) {
        throw new MedusaError(
          MedusaError.Types.NOT_ALLOWED,
          'Message contains content that is not allowed for security reasons.'
        )
      }
    }

One strong phrase, such as an explicit request for a password, can block a message on its own. Several weaker signals need to accumulate before the threshold is crossed. This catches messages that contain no links whatsoever but directly ask for "login credentials to verify the payment".
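To make the accumulation concrete, here is a minimal self-contained sketch of what the pattern matching could look like. The patterns, weights, and threshold below are invented placeholders in English, not the real rule set, which is in Polish and deliberately unpublished.

```typescript
type SignalCategory =
  | 'impersonation'
  | 'urgency'
  | 'credentials'
  | 'financial_data'

interface RiskPattern {
  category: SignalCategory
  regex: RegExp
  weight: number
}

// Hypothetical threshold and weights, chosen only for this illustration.
const RISK_THRESHOLD = 10

const PATTERNS: RiskPattern[] = [
  { category: 'credentials', regex: /\b(password|login credentials)\b/, weight: 10 },
  { category: 'impersonation', regex: /\bartovnia (support|administration)\b/, weight: 5 },
  { category: 'urgency', regex: /\b(within 24 hours|immediately|urgent)\b/, weight: 3 },
  { category: 'financial_data', regex: /\b(bank|card number)\b/, weight: 3 },
]

// Returns each pattern that occurs at least once in the lowercased text.
function matchedPatterns(text: string): RiskPattern[] {
  return PATTERNS.filter((pattern) => pattern.regex.test(text))
}

// Subject and body are scored as one string, so neither can hide a request.
function scoreMessage(subject: string, content: string): number {
  const text = `${subject} ${content}`.toLowerCase()
  return matchedPatterns(text).reduce((score, pattern) => score + pattern.weight, 0)
}
```

With these placeholder weights, an innocent question that mentions a bank scores 3 and passes, an explicit password request scores 10 and is blocked on its own, and a message combining urgency, impersonation, and bank details reaches 11 without any single strong phrase.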
Subject and body are evaluated together
A convenient way to bypass a content filter is to hide the dangerous request in the message subject rather than the body — hoping the system only checks content. Artovnia's validation concatenates subject + content into a single string before scoring, so a subject like "account verification" with an empty body still gets evaluated.
Blocked at the door, not cleaned up after
When a customer message exceeds the risk threshold, the API returns a NOT_ALLOWED error and nothing reaches the database. No thread is created, the seller receives no email, the content is never stored. This was a deliberate choice — post-hoc moderation means a malicious message still sits in the seller's inbox for some time before someone removes it.
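The ordering is the whole point, and a small sketch shows it. Everything here is a stand-in: the in-memory arrays replace the database and the mailer, isRisky replaces the real validation, and MessageRejectedError replaces the NOT_ALLOWED error.

```typescript
// Hypothetical error type standing in for the NOT_ALLOWED API error.
class MessageRejectedError extends Error {}

interface StoredMessage {
  subject: string
  content: string
}

// In-memory stand-ins for the database table and the outgoing email queue.
const savedMessages: StoredMessage[] = []
const sentNotifications: string[] = []

// Placeholder check; a real one would combine link detection and scoring.
function isRisky(subject: string, content: string): boolean {
  return /password/i.test(`${subject} ${content}`)
}

function submitCustomerMessage(subject: string, content: string): void {
  // Validation runs before any write, so a rejected message has no side effects.
  if (isRisky(subject, content)) {
    throw new MessageRejectedError(
      'Message contains content that is not allowed for security reasons.'
    )
  }
  savedMessages.push({ subject, content })
  sentNotifications.push(subject)
}
```

Because the check throws before the first write, there is nothing to roll back or clean up when a message is rejected.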
Redacting links where they wouldn't help anyway
Even a legitimate but external customer message reaching a seller shouldn't contain a clickable link to an outside domain in the notification email. Rather than discarding the whole message, the seller's email notification replaces external URLs with the text [external link hidden for security reasons] — and a permanent reminder appears at the bottom of every customer message stating that Artovnia never asks sellers for login credentials or payment details via customer messages.
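A minimal sketch of the redaction step, assuming the same kind of host check as the link filter. The function name redactExternalLinks and the URL pattern are assumptions for the example; only the replacement text comes from the actual notification.

```typescript
const REDACTION_TEXT = '[external link hidden for security reasons]'

// Matches explicit http(s) URLs and bare "www." hosts.
const LINK_PATTERN = /\b(?:https?:\/\/|www\.)[^\s<>"']+/gi

// True for artovnia.com itself and any of its subdomains.
function isArtovniaHost(hostname: string): boolean {
  const host = hostname.toLowerCase().replace(/\.$/, '')
  return host === 'artovnia.com' || host.endsWith('.artovnia.com')
}

// Replaces every external URL with the redaction text; internal links stay.
function redactExternalLinks(content: string): string {
  return content.replace(LINK_PATTERN, (raw) => {
    const withScheme = /^https?:\/\//i.test(raw) ? raw : `http://${raw}`
    try {
      return isArtovniaHost(new URL(withScheme).hostname) ? raw : REDACTION_TEXT
    } catch {
      // Anything URL-like that cannot be parsed is hidden as well.
      return REDACTION_TEXT
    }
  })
}
```

The rest of the message is left untouched, so the seller still sees what the customer wrote, just without a link to follow.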
There's one more, less obvious vector: nothing stops a customer from typing "Artovnia Support" or "Administration" in the sender name field. Reserved names — artovnia, support, administrator, admin — are now detected and replaced with a neutral "Artovnia Customer" in the seller's email header. Messages from Artovnia's actual administration are exempt from this rule — that's a trusted sender.
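A rough sketch of the sender-name check, assuming simple case-insensitive substring matching. The function name displaySenderName and the isTrustedAdmin flag are hypothetical; the reserved names and the neutral replacement are the ones listed above.

```typescript
const RESERVED_NAME_PARTS = ['artovnia', 'support', 'administrator', 'admin']
const NEUTRAL_SENDER_NAME = 'Artovnia Customer'

// Returns the name to show in the seller's email header.
function displaySenderName(name: string, isTrustedAdmin: boolean): string {
  // Real administration messages keep their name.
  if (isTrustedAdmin) return name

  // NFKC folds full-width and other compatibility characters before matching.
  const normalized = name.normalize('NFKC').toLowerCase()
  const isReserved = RESERVED_NAME_PARTS.some((part) => normalized.includes(part))
  return isReserved ? NEUTRAL_SENDER_NAME : name
}
```

Substring matching errs on the side of caution: a legitimate name that happens to contain "admin" would also be replaced, which costs the customer nothing more than a neutral label.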
Moderation panel for anything that slips through
Automated validation doesn't replace the ability to react manually. The admin now has a new Messages tab with a list of all threads, filters, and the option to mark a thread as spam. Marking is soft — the thread disappears from the seller's and buyer's view, but the record stays in the database and can be restored with a single click if moderation misjudges the intent.
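Soft marking can be sketched as a nullable timestamp on the thread. The Thread shape and the field name markedAsSpamAt are assumptions for this example, not the actual schema.

```typescript
interface Thread {
  id: string
  subject: string
  // null means the thread is visible; a date means it was marked as spam.
  markedAsSpamAt: Date | null
}

// Marks a thread as spam without deleting it.
function markAsSpam(thread: Thread, now: Date = new Date()): Thread {
  return { ...thread, markedAsSpamAt: now }
}

// Undoes a moderation mistake by clearing the mark.
function restoreThread(thread: Thread): Thread {
  return { ...thread, markedAsSpamAt: null }
}

// What the seller and the buyer see: everything that is not marked.
function visibleToParticipants(threads: Thread[]): Thread[] {
  return threads.filter((thread) => thread.markedAsSpamAt === null)
}
```

The record never leaves storage, so restoring a thread is a single field update rather than a recovery from backup.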
Shouldn't these safeguards have been there from the start?
That's the question I asked myself after the fact — and the answer is: no, and that's normal. Trust and safety for buyer-to-seller communication almost always emerges as a reaction to a real attack, not as a design assumption from day one. This applies to more than just small marketplaces.
Airbnb had no dedicated security team for its first three years. It was only the high-profile 2011 incident — in which a San Francisco host returned to a ransacked apartment — that led to the creation of an entire trust and safety team, a 24-hour support line, and host guarantees. Upwork and Fiverr actively detect attempts to move conversations off-platform to this day — that's exactly the same signal category I have in my scoring as "off-platform contact" — and that mechanism also emerged as a response to real abuse, not as a ready-made architectural element from the start.
The reason is practical: you can't meaningfully calibrate thresholds and signal categories without data on what an attack against your specific platform actually looks like. Before there's an incident, there's no reference point. That's why mature architecture of this kind — rules, then scoring, then moderation panel, then logging — is built in layers as the platform grows and becomes a real target. Artovnia's response differed from that pattern in one way: time. Instead of years, the entire sequence took a few days.
What I haven't done yet
- No logging of attack attempts. This specific attacker was banned and deleted immediately, so there's no remaining risk from that direction. But without logging (sender ID, IP, matched signals) it's harder to recognize someone who creates a new account and tries again. That's the next step, needed for detecting repeated attempts at scale.
- Rules are in Polish. The signal categories are universal, but the specific patterns are calibrated for Polish-language phishing targeting Polish sellers. A different buyer-to-seller communication language would require its own set of patterns.
- The threshold is a trade-off, not a guarantee. False positives are rare but possible: a customer who mentions "bank" or "courier delivery" in an innocent question might graze the threshold. That's better than letting through an attempt to steal login credentials.
- Rate limiting is currently per-user only. A per-IP limiter, alongside the existing in-memory per-user limiter, is planned.
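For the last point, one limiter class can serve both keys. This is a sketch of how a planned per-IP limit might sit next to a per-user one, assuming a simple fixed-window counter; the class name, limits, and window size are all hypothetical, and nothing here describes the existing limiter's internals.

```typescript
interface WindowEntry {
  windowStart: number
  count: number
}

// Fixed-window counter kept in memory, keyed by an arbitrary string.
class FixedWindowLimiter {
  private readonly hits = new Map<string, WindowEntry>()
  private readonly limit: number
  private readonly windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  // Returns true if the request fits in the current window for this key.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key)
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: start a new one.
      this.hits.set(key, { windowStart: now, count: 1 })
      return true
    }
    if (entry.count >= this.limit) return false
    entry.count += 1
    return true
  }
}

// Hypothetical limits: 5 messages per user and 20 per IP, per 10 minutes.
const perUser = new FixedWindowLimiter(5, 10 * 60 * 1000)
const perIp = new FixedWindowLimiter(20, 10 * 60 * 1000)

function allowMessage(userId: string, ip: string, now: number = Date.now()): boolean {
  return perUser.allow(userId, now) && perIp.allow(ip, now)
}
```

A per-IP limit is what slows down someone who registers a fresh account after a ban. Being in-memory, a limiter like this resets on restart and is not shared between server instances.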
Why this matters for the entire marketplace, not just one message
A seller in a marketplace trusts the platform in a different way than a buyer trusts a seller. A buyer trusts one seller for one transaction. A seller grants the platform persistent access to their admin panel, financial data, and communication with their own customers — not just for one transaction. If the buyer-to-seller communication channel can be used to steal that access, it's not an incident about one message. It's a question of whether sellers can safely run their business on the platform at all.
That's why the response didn't end at "block that one message." It ended with an architecture that blocks the entire class of attack — regardless of which endpoint or service a message tries to pass through.
Conclusion
The module came back online only now, with this entire set of defenses on board — not sooner. The attacker's account was banned and deleted the same day, so the threat from that specific direction is closed.
I have no illusions that this was the last attempt of this kind. Someone will try something else, sooner or later — that's how it works. The difference is that next time they'll need to bypass two independent signals instead of one link filter, and if something slips through anyway, I'll find out from the moderation panel rather than from a screenshot sent by an angry seller.


