I've been trying to scrape an entire site, but it is not working. In the log, it is apparent that the slash after the domain is missing, so every file reports a status of "Has Moved!". For instance: domain.orgimages/logo.gif rather than .org/images/...
I tried adding a slash to the end of the URL, but that makes no difference. I tried adding two slashes too, and that made no difference either.
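To show what I think the log is telling me (this is only my guess at the cause, not the tool's actual code), here is a small Python sketch. The `http://` scheme and the variable names are placeholders I made up; only `domain.org` and `images/logo.gif` come from the log.

```python
from urllib.parse import urljoin

# Placeholder values: the scheme is assumed, the host and path are from the log.
base = "http://domain.org"
link = "images/logo.gif"

# Plain string concatenation drops the separator, which matches the log entry.
broken = base + link

# Resolving the relative link against a base that ends in "/" keeps the slash.
joined = urljoin(base + "/", link)

print(broken)
print(joined)
```

If the scraper builds its URLs the first way, adding a slash to the start URL would not help, which would be consistent with what I'm seeing.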
I reset all my settings to the factory defaults. No dice.
This is a WordPress site, so the HTML pages have no file extensions (e.g. ...org/about/ rather than ...org/about.html).
Any ideas? Thanks much.