Opened 5 months ago

Last modified 31 hours ago

#32117 new project

Understand and document BridgeDB bot scraping attempts

Reported by: cohosh
Owned by:
Priority: Medium
Milestone:
Component: Circumvention/BridgeDB
Version:
Severity: Normal
Keywords:
Cc: dcf, phw, cohosh
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

We are aware of automated attempts to enumerate bridges in BridgeDB, but lack a more rigorous understanding of the problem.

We have detected bot requests to BridgeDB's web interface and deployed two defences: rejecting requests whose headers are commonly associated with bots, and handing out fake bridges to requests we suspect come from bots (#31252).

We also suspect that these bots are solving our CAPTCHAs more accurately than users (#24607).
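
The header-based defence mentioned above could be sketched roughly as follows. This is only an illustration and not BridgeDB's actual code: the function names, the list of suspicious User-Agent substrings, the Accept-Language heuristic, and the decoy address (taken from the RFC 5737 documentation range) are all assumptions made for the example.

```python
from typing import List, Mapping, Sequence

# Assumed fingerprints; the real header rules are not listed in this ticket.
SUSPICIOUS_USER_AGENT_SUBSTRINGS = ("python-requests", "curl", "wget")

# Decoy bridge line using a documentation-only address (RFC 5737).
DECOY_BRIDGES = ["192.0.2.1:443"]


def looks_like_bot(headers: Mapping[str, str]) -> bool:
    """Return True if the request headers match a simple bot heuristic."""
    # Header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "").lower()
    if not user_agent:
        return True
    if any(s in user_agent for s in SUSPICIOUS_USER_AGENT_SUBSTRINGS):
        return True
    # Assumption: browsers normally send Accept-Language, many scripts do not.
    return "accept-language" not in normalized


def select_bridges(headers: Mapping[str, str],
                   real_bridges: Sequence[str]) -> List[str]:
    """Hand out decoy bridges to suspected bots, real bridges otherwise."""
    if looks_like_bot(headers):
        return list(DECOY_BRIDGES)
    return list(real_bridges)
```

Handing out decoys rather than an error keeps a suspected bot from learning that it was detected, at the cost of giving a misclassified human user bridges that do not work.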

After a recent campaign to recruit more volunteer bridges, we set up an experiment to test the reachability of a subset of these new bridges from a probe site in Beijing. All new bridges in our sample were blocked, most of them from the very start of the experiment (#31701).

This ticket is for documenting bot behaviour and brainstorming ways to detect and analyze automated scraping of BridgeDB by censor-owned bots.

Change History (1)

comment:1 Changed 31 hours ago by phw

We should dig deeper into the analysis over here. In particular, why is the CAPTCHA success rate for users from the U.S. higher for vanilla bridges than for obfs4 bridges?
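
One way to start on that comparison is to compute the CAPTCHA success rate per (country, bridge type) pair. The sketch below assumes a hypothetical record format of (country, transport, solved) tuples; BridgeDB's real metrics format is not described in this ticket, so the parsing step is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record: (country code, bridge type, CAPTCHA solved?)
Record = Tuple[str, str, bool]


def success_rates(records: Iterable[Record]) -> Dict[Tuple[str, str], float]:
    """Return the CAPTCHA success rate for each (country, transport) pair."""
    totals = defaultdict(lambda: [0, 0])  # key -> [solved, attempts]
    for country, transport, solved in records:
        counts = totals[(country, transport)]
        if solved:
            counts[0] += 1
        counts[1] += 1
    return {key: solved / attempts
            for key, (solved, attempts) in totals.items()}
```

Comparing the rates for ("us", "vanilla") and ("us", "obfs4") over the same time window would show whether the gap persists. With few attempts per group, the difference may be noise.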
