d48f0d23abd8ff0b2a276ad366ef9733.ppt
- Number of slides: 20
Detecting Fraudulent Clicks From BotNets 2.0 • Adam Barth • Joint work with Dan Boneh, Andrew Bortz, Collin Jackson, John Mitchell, Weidong Shao, and Elizabeth Stinson
BotNets, Current and Future • Traditional BotNets – Permanent malware – Infect host: email attachments, drive-by downloads – Click-fraud, spam, DDoS, key-logging – ~100,000 members • BotNets 2.0 – Ephemeral – Browser-based: malicious advertisements, popular web sites – Click-fraud, spam, (maybe DDoS) – Much larger
Browser Security Model • Same-origin policy for network access – Origin is scheme://host:port • Write HTTP anywhere on the network – Easy using HTML forms – Except restricted ports, like 25 (SMTP) • Read from origin only – Can read some “library” formats from anywhere • JavaScript, CSS, images, applets, etc.
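The origin definition above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `origin` and `same_origin` are hypothetical, only the `http` and `https` default ports are filled in, and real browsers apply additional rules not modeled here.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch handles (an assumption;
# other schemes simply get a port of None when none is given).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)


if __name__ == "__main__":
    print(same_origin("http://example.com/a", "http://example.com:80/b"))
    print(same_origin("http://example.com/", "https://example.com/"))
```

Note that the comparison is on the host name, not on the IP address the name resolves to.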
Desired Properties of Policy • Can’t send spam – Writes to port 25 blocked • Can’t click advertisements – Need to READ a token to make a click count • Unfortunately…
DNS Rebinding Attacks • Circumvent browser network access policy • attacker.com points to attacker and target • [Diagram: the attacker’s server answers <policy-file-request/> with <allow-access-from domain="*" to-ports="*" />; the attacker then rebinds DNS from the attacker’s server to the target server] • Can read and write sockets to anywhere
An Experiment • We ran a Flash ad (gains socket access) – Paid $30 – 50,951 impressions from 44,924 unique IP addresses • 90.6% of browsers vulnerable – More if we include other rebinding attacks • $100 to hijack 100,000 IP addresses – No click required – Impressions are cheap
Duration of IP Hijacking
A Long Tail • Some impressions last for days
Using Rebinding for Click-Fraud • Enroll as a publisher with ad network A – Publish pay-per-click ads on your site • Enroll as an advertiser with ad network B – Buy pay-per-impression Flash ads • Buy bots for $0.001 each – Use 99% just to generate impressions on your site – Use 1% to generate ad clicks on $0.50-per-click ads – Multiply your money by 5, repeat
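The "multiply your money by 5" figure follows from the numbers on the slide, and the arithmetic can be sketched as below. The function name `click_fraud_return` is hypothetical, and the sketch assumes the publisher receives the full $0.50 per click (any revenue share kept by the ad network is ignored).

```python
def click_fraud_return(bots, cost_per_bot=0.001, click_fraction=0.01,
                       payout_per_click=0.50):
    """Return (spend, revenue, multiplier) for one round of the scheme.

    Assumes the publisher keeps the whole per-click payout, which is a
    simplification not stated in the source.
    """
    spend = bots * cost_per_bot                           # buying impressions
    revenue = bots * click_fraction * payout_per_click    # 1% of bots click
    return spend, revenue, revenue / spend


if __name__ == "__main__":
    spend, revenue, multiplier = click_fraud_return(100_000)
    print(f"spend ${spend:.2f}, revenue ${revenue:.2f}, x{multiplier:.1f}")
```

The multiplier does not depend on the number of bots: it is 0.01 × $0.50 / $0.001 = 5.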
Implications for Click-Fraud Defense • Simulates IP distribution exactly – Each bot an independent sample from web visitors – Black-listing IPs as bot infested meaningless • Traffic time-appropriate for IP – Human at that IP actually surfing the web right now • HTTP headers appropriate for IP – Grab real headers from request for Flash ad – Can’t get cookies, but many networks don’t use them
Distinguish Bots from Humans • Bots cannot simulate human cognition • Can’t use traditional CAPTCHAs – Too disruptive to the user experience – User has no interest in proving their humanity • Click-fraud detection is a different problem – CAPTCHAs determine whether this client is a human – We just need to estimate the proportion of humans
A Straw-Man Design • Humans click “Yes!” • Bots click at random • Ad network stats: – 3487 Yes clicks – 1271 No clicks • How many bots? – Expectation: 2542 – High-probability bound is an exercise for the reader
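The expectation of 2542 comes from the straw-man model itself: if humans always click "Yes!" and bots pick either button with probability 1/2, every "No" click is a bot and represents about half of the bots. A minimal sketch under exactly that model follows; the function name `estimate_bots` is hypothetical, and the spread it returns is a rough plug-in figure (variance of 2 × No under a Binomial(B, 1/2) model is B), not the high-probability bound the slide leaves open.

```python
import math


def estimate_bots(yes_clicks, no_clicks):
    """Estimate bot and human click counts under the straw-man model.

    Model assumptions: humans always click Yes; each bot clicks Yes or No
    with probability 1/2, independently.
    """
    bots = 2 * no_clicks                     # No clicks are half the bots
    humans = yes_clicks + no_clicks - bots   # the rest are human clicks
    spread = math.sqrt(bots)                 # plug-in standard deviation
    return bots, humans, spread


if __name__ == "__main__":
    bots, humans, spread = estimate_bots(3487, 1271)
    print(bots, humans, round(spread, 1))
```

With the slide's numbers this gives 2542 bot clicks and 2216 human clicks out of 4758 total.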
A Real Advertisement • Where will humans click? • Bots cannot simulate • Can’t trick humans into clicking – Actually need to process the ad
Image Recognition Doesn’t Help • Suppose the bot can identify the hot spots – Say by segmenting the image using vision techniques • In what ratio should the bot click? – Depends on the relative appeal of the hot spots – Requires human-level AI to get right • Any error a signal of bot proportion
Fraudster Has to Click on Many Ads
Ad Network can Measure Humans • At first, run ads on trusted partners – Record distribution of human click location – Easy to record (x, y) coordinates of click on web • Cheap for ad network – Was going to run ad anyway • Expensive for attacker to influence – Must use valuable bot clicks without payout – Must be clicking everywhere all the time
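One way to turn recorded click locations into a bot signal is sketched below. It is an illustration, not the authors' method: clicks are bucketed into a grid (the 4×4 grid and the names `region_of`, `distribution`, and `bot_fraction_lower_bound` are assumptions), and the observed traffic is compared with the reference distribution from trusted partners. If observed traffic is a mixture (1 − f) × human + f × bot, the total variation distance between observed and reference equals f times the distance between the bot and human distributions, so it is at most f. Sampling noise is ignored here.

```python
from collections import Counter


def region_of(x, y, width, height, cols=4, rows=4):
    """Map an (x, y) click inside a width x height ad to a grid cell index."""
    col = min(int(x * cols / width), cols - 1)
    row = min(int(y * rows / height), rows - 1)
    return row * cols + col


def distribution(clicks, width, height, cols=4, rows=4):
    """Fraction of clicks falling in each grid cell."""
    if not clicks:
        raise ValueError("need at least one click")
    counts = Counter(region_of(x, y, width, height, cols, rows)
                     for x, y in clicks)
    total = len(clicks)
    return [counts.get(i, 0) / total for i in range(cols * rows)]


def bot_fraction_lower_bound(reference, observed):
    """Total variation distance between the two distributions.

    Under the mixture assumption this is a lower bound on the bot fraction.
    """
    return 0.5 * sum(abs(r - o) for r, o in zip(reference, observed))


if __name__ == "__main__":
    trusted = [(10, 10)] * 100                       # toy reference clicks
    traffic = [(10, 10)] * 80 + [(90, 90)] * 20      # toy observed clicks
    ref = distribution(trusted, 100, 100)
    obs = distribution(traffic, 100, 100)
    print(bot_fraction_lower_bound(ref, obs))
```

In practice the bound would need a noise allowance, since two samples of purely human traffic will also differ slightly.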
A Work in Progress • Need to validate diversity in distribution – Will run real ads and measure click location – How does distribution vary by screen location of ad? • Experiment with ad design – Objective: human click location hard for bot to predict • Text ads? – Less area to click and less enticing visuals – There still might be a valuable signal in click location
Conclusions • BotNets 2.0 are coming – Cheap, large-scale, ephemeral bots in the browser – Don’t require full-machine compromise – Heuristic click-fraud detection’s days are numbered • Click location can divide humans from bots – Accurate simulation requires human cognition – Easy for ad networks to deploy – More science needed to determine effectiveness
Thanks!