This might be a subquestion of Passing arguments to process.crawl in Scrapy python, but the author accepted an answer there that doesn't address the part I'm asking about.
Here's my problem: I cannot use

scrapy crawl mySpider -a start_urls=myUrl -o myData.json

Instead I want/need to use

crawlerProcess.crawl(spider)

I have already figured out several ways to pass the arguments (and in any case that part is answered in the question I linked), but I can't grasp how I am supposed to tell it to dump the scraped data into myData.json, i.e. the equivalent of the

-o myData.json

part.
Does anyone have a suggestion? Or am I just misunderstanding how this is supposed to work?
Here is the code :
crawlerProcess = CrawlerProcess(settings)
crawlerProcess.install()
crawlerProcess.configure()

spider = challenges(start_urls=["http://www.myUrl.html"])
crawlerProcess.crawl(spider)
# For now I am just trying to get that bit of code to work,
# but obviously it will become a loop later.

dispatcher.connect(handleSpiderIdle, signals.spider_idle)

log.start()
print "Starting crawler."
crawlerProcess.start()
print "Crawler stopped."
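From reading around, I suspect that what `-o myData.json` really does is set the feed-export settings before the crawl starts, so maybe I should put those keys into the settings object I hand to CrawlerProcess. This is just a guess at the mechanism, not something I've confirmed:

```python
# Hypothetical sketch: the two feed-export keys I believe
# "-o myData.json" sets under the hood on the command line.
settings = {
    'FEED_FORMAT': 'json',      # assumed: format inferred from the -o extension
    'FEED_URI': 'myData.json',  # assumed: the path given to -o
}

# I would then build the process from these settings, e.g.:
# crawlerProcess = CrawlerProcess(settings)
```

Is this the right direction, or is there a dedicated API for attaching an exporter to the process?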