How to make asynchronous HTTP GET requests in Python and pass response object to a function
Update: the problem was incomplete documentation; the event dispatcher passes kwargs to the hook function.
I have a list of 30k URLs that I want to check for various strings. I have a working version of the script using Requests and BeautifulSoup, but it doesn't use threading or asynchronous requests, so it's incredibly slow.
Ultimately, I would like to cache the HTML for each URL so I can run multiple checks without making redundant HTTP requests to each site. If I have a function that will store the HTML, what's the best way to asynchronously send the HTTP requests and pass the response objects to it?
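One way to picture the caching idea is sketched below. It is not grequests: it swaps in a thread pool from the standard library's concurrent.futures, and the names build_cache, run_checks and fake_fetch are made up for illustration. fake_fetch stands in for a real download so the sketch runs without network access; it uses Python 3 syntax.

```python
from concurrent.futures import ThreadPoolExecutor


def build_cache(urls, fetch, max_workers=20):
    """Fetch every URL once, concurrently, and return {url: html}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs each URL
        # with the body that was fetched for it.
        return dict(zip(urls, pool.map(fetch, urls)))


def run_checks(cache, needles):
    """Return {url: [needles found in that page]} without refetching."""
    return {url: [n for n in needles if n in html]
            for url, html in cache.items()}


def fake_fetch(url):
    # Hypothetical stand-in for a real download, e.g. requests.get(url).text
    return '<html>page for %s</html>' % url


urls = ['http://finance.yahoo.com/', 'http://www.bloomberg.com/']
cache = build_cache(urls, fake_fetch)
hits = run_checks(cache, ['yahoo', 'bloomberg'])
```

Because the checks read only from the cache, any number of string searches can run after a single round of requests. For 30k URLs the worker count and error handling would need tuning; both are left out here.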
I've been trying to use grequests (as described here) and the "hooks" parameter, but I'm getting errors and the documentation doesn't go very in-depth. I'm hoping someone with more experience can shed some light.
Here's a simplified example of what I'm trying to accomplish:
    import grequests

    urls = ['http://www.google.com/finance',
            'http://finance.yahoo.com/',
            'http://www.bloomberg.com/']

    def print_url(r):
        print r.url

    def async(url_list):
        sites = []
        for u in url_list:
            rs = grequests.get(u, hooks=dict(response=print_url))
            sites.append(rs)
        return grequests.map(sites)

    print async(urls)
And it produces the following TypeError:
    TypeError: print_url() got an unexpected keyword argument 'verify'
    <Greenlet at 0x32803d8L: <bound method AsyncRequest.send of <grequests.AsyncRequest object at 0x00000000028D2160>>(stream=False)> failed with TypeError
I'm not sure why it's sending the 'verify' keyword argument by default. It would be great to get something working though, so if anyone has any suggestions (using grequests or otherwise), please share :)
Thanks in advance.
I tried your code and could get it to work by adding an additional parameter, **kwargs, to the print_url function.
    def print_url(r, **kwargs):
        print r.url
I figured out what was wrong from this other Stack Overflow question: Problems with hooks using Requests Python package.
It seems that when you use a response hook in grequests, you need to add **kwargs to the callback definition.
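To see why the extra parameter matters, here is a small sketch that runs without grequests installed. dispatch_hook below is a simplified stand-in for what the library does when it calls a hook, not its real code, and FakeResponse, store_html and strict_hook are hypothetical names. It uses Python 3 syntax and also shows a hook that stores the HTML by URL instead of printing it.

```python
html_cache = {}


class FakeResponse(object):
    """Hypothetical minimal response with the two attributes used here."""

    def __init__(self, url, text):
        self.url = url
        self.text = text


def dispatch_hook(hook, response, **kwargs):
    # Simplified stand-in: the dispatcher forwards request settings such as
    # verify and stream as keyword arguments, so the hook must accept them.
    return hook(response, **kwargs)


def store_html(r, **kwargs):
    # Accept and ignore the extra keyword arguments; cache the body by URL.
    html_cache[r.url] = r.text


def strict_hook(r):
    # No **kwargs, so calling it with verify=... raises TypeError,
    # which is the failure described in the question.
    html_cache[r.url] = r.text


resp = FakeResponse('http://www.google.com/finance', '<html>stub</html>')
dispatch_hook(store_html, resp, verify=True, stream=False)

try:
    dispatch_hook(strict_hook, resp, verify=True, stream=False)
    failed = False
except TypeError:
    failed = True
```

With grequests itself, a hook like store_html would be passed the same way as in the question, i.e. hooks=dict(response=store_html).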