The Anatomy of a Phishing Link: We Analyzed 600 URLs (2026 Study)
Everyone says "don't click suspicious links." But what actually makes a link suspicious — and can a safety scanner reliably tell the difference? We decided to measure it.
We took 300 confirmed, currently-live phishing URLs from the verified feeds at PhishTank and OpenPhish, and 300 of the most-visited legitimate websites from the Majestic Million ranking, and ran all 600 through the same automated safety checks our Link Safety Checker uses — domain age, SSL/HTTPS, DNS, security blacklists, and URL heuristics. Here's what the data shows.
The headline finding: most phishing links look "safe" to a single scan
The single most important result is an uncomfortable one:
Nearly two-thirds of confirmed phishing URLs — 64% — were rated "safe" by an individual automated scan. Only about 36% tripped enough red flags to be flagged as risky.
That is not a knock on any one tool; it's the reality of how automated URL checking works in 2026. On the other side, the legitimate control group behaved exactly as you'd hope: 95% were rated safe, with an average risk score of 4.5 out of 100 — compared to 20.8 for the phishing set. So the signals absolutely separate the two populations on average. The problem is the overlap: a large share of individual phishing links simply don't look dangerous to an automated check.
What phishing links actually have in common
Here's how the two groups compared on each red flag. Each number is the percentage of URLs in that group that triggered the signal.
| Warning sign | Phishing URLs | Legitimate sites |
|---|---|---|
| Brand-new or unverifiable domain age | 82.7% | 18.7% |
| SSL / HTTPS problem | 25.3% | 7.0% |
| Failed DNS resolution | 12.3% | 2.0% |
| Suspicious URL keywords (e.g. "secure", "verify", "login") | 9.7% | 0.0% |
| Already on a security blacklist | 7.0% | 0.0% |
| Suspicious top-level domain (TLD) | 3.3% | 0.0% |
| Raw IP address instead of a domain name | 0.0% | 0.0% |
A few things jump out:
- Domain age is the strongest single signal. More than four out of five phishing URLs sat on a domain that was either brand-new or whose registration date couldn't be verified — versus fewer than one in five legitimate sites. Attackers register throwaway domains by the thousand. (See our glossary entry on new-account and domain abuse for how this fits the wider fraud toolkit.)
- The padlock is meaningless. About three-quarters of phishing URLs (74.7%, the inverse of the 25.3% with an SSL problem) had a working SSL certificate. HTTPS encrypts the connection; it says nothing about the honesty of the site. "It has a padlock" has not been a safety signal for years.
- Blacklists are slow. Only 7% of these confirmed phishing URLs were already on Google's Safe Browsing blacklist at scan time. Blacklists are reactive — a brand-new phishing page is dangerous for hours or days before it's listed.
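One way to read the table is to ask two questions of each signal: how much of the phishing set does it cover, and how trustworthy is a hit? The sketch below computes both from the percentages in the table. The "precision" figure assumes the two groups are the same size (300 each, as in this study), which real-world traffic is not, and the function and variable names are illustrative only.

```python
# Percentages copied from the table above: (phishing %, legitimate %).
SIGNALS = {
    "domain age": (82.7, 18.7),
    "ssl/https problem": (25.3, 7.0),
    "failed dns": (12.3, 2.0),
    "suspicious keywords": (9.7, 0.0),
    "blacklisted": (7.0, 0.0),
    "suspicious tld": (3.3, 0.0),
    "raw ip": (0.0, 0.0),
}


def precision(phish_pct, legit_pct):
    """Share of flagged URLs that are phishing, assuming equal group sizes."""
    flagged = phish_pct + legit_pct
    return phish_pct / flagged if flagged else None


def summarize(signals):
    """Return (name, phishing coverage %, precision) for every signal."""
    return [
        (name, phish, precision(phish, legit))
        for name, (phish, legit) in signals.items()
    ]


for name, coverage, prec in summarize(SIGNALS):
    prec_text = "n/a" if prec is None else f"{prec:.0%}"
    print(f"{name:<20} coverage {coverage:5.1f}%  precision {prec_text}")
```

Read this way, the trade-off is visible at a glance: domain age has by far the widest coverage but also fires on legitimate sites, while the blacklist never misfired in this sample but covers only a small slice of the phishing set.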
Why no single check is enough
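If domain age flags 83% of phishing, why did 64% still come back as "safe"? Three reasons. One red flag on its own is usually not enough to tip the overall rating: 18.7% of legitimate sites tripped the domain-age flag too, yet 95% of them were rated safe. The signals also rarely stack on the same URL. And modern phishing increasingly hides on infrastructure that passes every check.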
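To make the stacking problem concrete, here is a minimal sketch of how a flag-based risk score could be combined into a verdict. The weights and the threshold are hypothetical, chosen only for illustration; the checker's real scoring is not described in this article.

```python
# Hypothetical weights and threshold, for illustration only.
# The article does not publish the checker's real scoring.
WEIGHTS = {
    "new_or_unverified_domain": 20,
    "ssl_problem": 25,
    "dns_failure": 25,
    "suspicious_keywords": 15,
    "blacklisted": 100,
    "suspicious_tld": 15,
    "raw_ip_host": 30,
}
RISKY_THRESHOLD = 40  # hypothetical cut-off on a 0-100 scale


def risk_score(flags):
    """Sum the weights of the triggered flags, capped at 100."""
    return min(100, sum(WEIGHTS[flag] for flag in flags))


def verdict(flags):
    """Rate a URL 'risky' once its score reaches the threshold."""
    return "risky" if risk_score(flags) >= RISKY_THRESHOLD else "safe"


# One common flag alone stays under the threshold...
print(verdict({"new_or_unverified_domain"}))
# ...but two flags on the same URL push it over.
print(verdict({"new_or_unverified_domain", "ssl_problem"}))
```

Under a scheme like this, a signal that fires on most phishing URLs can still leave most of them rated "safe" if it usually fires alone.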
A growing share of phishing doesn't live on a shady new domain at all. It lives on:
- Subdomains of legitimate platforms: free hosting on blogspot.com, replit.app, form builders, and cloud storage. The underlying domain is old, reputable, and HTTPS-valid, so age, SSL, and reputation checks all pass.
- Compromised legitimate websites: a real small-business site that's been hacked to host a login page. Every domain-level signal looks perfect because the domain is legitimate.
For these, there is no single technical tell. That's the real lesson of the data: an automated scan is a smoke detector, not a force field. It catches the crude stuff (throwaway domains, blacklisted pages, keyword-stuffed URLs) and routinely misses phishing that borrows a trusted host.
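A scanner can at least notice when a link sits on a shared platform, so that the platform's clean domain-level results are not mistaken for a verdict on the page itself. The sketch below is one way to do that. The suffix list contains only the two platforms named above, and the function name and sample URLs are hypothetical.

```python
from urllib.parse import urlsplit

# Illustrative list: only the two platforms named in the article.
# A real list would be far longer.
SHARED_HOSTING_SUFFIXES = ("blogspot.com", "replit.app")


def is_on_shared_platform(url):
    """Return True when the host is a subdomain of a listed shared platform."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host.endswith("." + suffix) for suffix in SHARED_HOSTING_SUFFIXES)


# Hypothetical URLs for illustration.
print(is_on_shared_platform("https://account-check.blogspot.com/login"))
print(is_on_shared_platform("https://example.com/login"))
```

A match does not mean the page is malicious; it only means the age, SSL, and reputation results describe the platform, not whoever published the page. Note that `urlsplit` needs a scheme (such as `https://`) to recognise the host, so bare hostnames should be normalised first.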
What this means for you
The takeaway isn't "scanners are useless." It's that safety online has to be layered, because every individual layer has holes the study just measured:
- Scan unfamiliar links for the obvious red flags. A link safety check catches the crudest attempts and takes two seconds. Just don't read a "safe" result as a guarantee.
- Never enter credentials on a page you reached from an unexpected message. The one thing every phishing page needs is for you to type something in. If you didn't initiate the visit, don't. (More on this in how to identify phishing links.)
- Turn on multi-factor authentication everywhere, so a phished password alone can't open your account.
- Use identity-theft protection that monitors for misuse of your personal data — because the goal of most phishing is your identity, and the safety net matters for the times a link does get through. If you're weighing it up, see do I need identity theft protection?
No single one of these would have caught every phishing link in our study. Together, they close most of the gaps.
Methodology
We sampled 300 phishing URLs at random from the verified, currently-online feeds published by PhishTank and OpenPhish, and 300 legitimate control URLs from the top of the Majestic Million most-referenced-domains list. Each URL was run once through the same five automated checks used by our Link Safety Checker: WHOIS/RDAP domain age, SSL certificate validity, DNS resolution, Google Safe Browsing blacklist status, and a set of URL heuristics (suspicious TLDs, excessive subdomains, raw-IP hosts, and suspicious keyword patterns). "Domain age" combines genuinely newly-registered domains with those whose registration could not be verified within the lookup window, which is itself a phishing-correlated signal. Figures reflect a snapshot taken in 2026 and will shift as feeds update; the study can be re-run to refresh them.
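For readers who want to see what the URL-heuristics step might look like, here is a minimal sketch covering the four categories listed above. It is not the checker's actual code: the keyword list reuses the three examples from the results table, while the TLD list and the subdomain limit are assumptions made for illustration.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed values: the article names the heuristic categories, not the lists.
SUSPICIOUS_KEYWORDS = ("secure", "verify", "login")  # examples from the table
SUSPICIOUS_TLDS = ("zip", "top", "xyz")              # hypothetical
MAX_SUBDOMAIN_LABELS = 3                             # hypothetical limit


def is_raw_ip(host):
    """Return True when the host is an IPv4 or IPv6 literal."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def url_heuristics(url):
    """Return the set of heuristic flags a URL triggers."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    flags = set()
    if is_raw_ip(host):
        flags.add("raw_ip_host")
    else:
        labels = [label for label in host.split(".") if label]
        if labels and labels[-1] in SUSPICIOUS_TLDS:
            flags.add("suspicious_tld")
        # Labels left of "name.tld" count as subdomains; this simplification
        # ignores multi-part suffixes such as co.uk.
        if len(labels) - 2 > MAX_SUBDOMAIN_LABELS:
            flags.add("excessive_subdomains")
    text = (host + parts.path + "?" + parts.query).lower()
    if any(word in text for word in SUSPICIOUS_KEYWORDS):
        flags.add("suspicious_keywords")
    return flags


# Hypothetical URL for illustration.
print(sorted(url_heuristics("https://a.b.c.d.example.xyz/verify")))
```

Because these checks only inspect the URL string, they are cheap to run but easy to evade, which is why they are one of five checks and not a verdict on their own.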
Want to check a specific link right now? Paste it into the free Link Safety Checker — just remember what the data says: a clean result means "no obvious red flags," not "guaranteed safe."
Frequently Asked Questions
Can an automated tool tell me if a link is a phishing site?
Partly. In our study, a single automated scan correctly flagged only about 36% of confirmed phishing URLs as risky — nearly two-thirds were rated 'safe.' Automated checks are useful for catching obvious red flags (brand-new domains, blacklisted sites, missing HTTPS), but modern phishing increasingly hides on legitimate platforms and compromised sites that pass those checks. Treat a 'safe' result as 'no obvious red flags,' not a guarantee.
What do most phishing links have in common?
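In our sample, phishing URLs were far more likely than legitimate sites to sit on a brand-new or unverifiable domain (82.7% vs 18.7%), to have an SSL/HTTPS problem (25.3% vs 7%), to fail DNS resolution (12.3% vs 2%), to contain suspicious keywords in the URL like 'secure,' 'verify,' or 'login' (9.7% vs 0%), or to already appear on a security blacklist (7% vs 0%). None of these is enough on its own: domain age, the only signal that fired on a majority of phishing URLs, also fired on nearly one in five legitimate sites, and every other signal missed at least three-quarters of the phishing set.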
Why do phishing links have valid HTTPS padlocks now?
Because HTTPS certificates are free and automated. In our study three-quarters of phishing URLs had a working SSL certificate — the padlock only means the connection is encrypted, not that the site is honest. Attackers get certificates as easily as legitimate sites do, so 'it has a padlock' is no longer a safety signal.
How can I actually protect myself from phishing links?
Use layers, because no single check is reliable. Scan unfamiliar links with a tool for obvious red flags, never enter credentials on a page you reached from an unexpected message, enable multi-factor authentication so a stolen password isn't enough, and use identity-theft protection that monitors for misuse of your data if you do get caught. The study's core finding is that any one of these alone leaves large gaps.