Cheapest Web Software Support Area Forum Index www.cheapestwebsoftware.com
BA crawling feed but not picking up 100% of the posts
unc0nnected



Joined: 17 Dec 2010
Posts: 27

Posted: Fri Mar 11, 2011 2:23 am    Post subject: BA crawling feed but not picking up 100% of the posts
I have BA set to crawl several RSS feeds from one of my WordPress blogs. Each feed has exactly 400 posts in it, but when I get BA to crawl them it never picks up 400. It always picks up some seemingly random amount, ranging from 50 up to 258.

If I go back and recrawl the same feed, it comes up with nothing new.

The interesting thing is that it finds the exact same (wrong) number of items when I empty the feed and recrawl it. So I opened the feed in Firefox, copied everything into Quanta and stripped everything except the titles, and sure enough I was left with 400 lines. I went even further and removed all items with identical titles, just in case BA was filtering out what it thought were duplicates, and I was still left with 308.

So for some reason BA is only seeing 50 in this instance when there are 400, or 263, or whatever the number is. It's always the same number and it's always wrong!

Any ideas?
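
The manual check described above (strip the feed down to titles, then count total versus distinct) can also be scripted. The following is a minimal sketch, assuming an RSS 2.0 feed with one `<title>` per `<item>`; `SAMPLE_FEED` and `title_counts` are hypothetical names, and the three-item sample only stands in for a real 400-item feed.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample standing in for a real feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item><title>McDonalds</title><link>http://example.com/?p=1</link></item>
    <item><title>McDonalds</title><link>http://example.com/?p=2</link></item>
    <item><title>Burger King</title><link>http://example.com/?p=3</link></item>
  </channel>
</rss>"""


def title_counts(feed_xml):
    """Return (total items, distinct titles, Counter of titles) for an RSS 2.0 string."""
    root = ET.fromstring(feed_xml)
    # Only <item> titles are counted; the channel's own <title> is skipped.
    titles = [(item.findtext("title") or "").strip() for item in root.iter("item")]
    counts = Counter(titles)
    return len(titles), len(counts), counts


total, unique, counts = title_counts(SAMPLE_FEED)
print(total, unique)           # 3 2
print(counts.most_common(1))   # the most frequently repeated title
```

Counting with a `Counter` does not depend on the order of the titles, so it avoids miscounts that can creep into a hand-edited list.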
unc0nnected



Joined: 17 Dec 2010
Posts: 27

Posted: Fri Mar 11, 2011 4:41 am
OK, I just changed the permalink syntax from /%monthnum%/%postname%/ to wordpress/?p=123, and now I am getting a parser error instead:

Parser Error: Undeclared entity error at line 9, column 57

I then pointed BA at the RSS feed (not the RSS2 feed) and it is crawling again, but still with the wrong counts: it pulled 107 posts when there are 400, and on the next feed it found 68 items out of 400.

Thanks
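
For anyone hitting a similar "undeclared entity" message: the thread does not say which entity tripped BA's parser, but one common cause of this class of error is an HTML named entity such as `&nbsp;` appearing in a feed, since XML predefines only five entities. The sketch below reproduces that situation with Python's standard XML parser, not BA's; `BAD`, `GOOD` and `try_parse` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: &nbsp; is an HTML entity that XML does not predefine.
BAD = "<item><title>Fish&nbsp;&amp; Chips</title></item>"
# Same text using a numeric character reference, which XML parsers accept.
GOOD = "<item><title>Fish&#160;&amp; Chips</title></item>"


def try_parse(xml_text):
    """Return the title text, or the parser's error message if parsing fails."""
    try:
        return ET.fromstring(xml_text).findtext("title")
    except ET.ParseError as exc:
        return "parse error: " + str(exc)


print(try_parse(BAD))   # a parse error reporting the line and column
print(try_parse(GOOD))  # the title, with a non-breaking space
```

The line and column in the message point at the offending entity, which is usually the quickest way to find it in the raw feed source.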
Atanasis
Owner


Joined: 22 May 2004
Posts: 4284
Location: The Net

Posted: Fri Mar 11, 2011 10:16 am
Do your posts have unique titles? BA skips posts within the same feed that have the same title.
_________________
Thanks,
Kaktusan

unc0nnected



Joined: 17 Dec 2010
Posts: 27

Posted: Fri Mar 11, 2011 10:20 pm
Yep, as I mentioned above, I even went and deleted all duplicate titles to count the uniquely titled posts, and that number was still far higher than the number of entries BA was pulling in (308 uniquely titled posts versus fewer than 100 entries reported by BA).

I'll PM you some more info
unc0nnected



Joined: 17 Dec 2010
Posts: 27

Posted: Fri Mar 11, 2011 11:17 pm
Is there any way to disable the feature where BA only grabs uniquely titled posts? I have one post per restaurant, so there might be 20 McDonalds in a city, all titled McDonalds, with the address and info for each one inside the post, like a phone book listing.
Atanasis
Owner


Joined: 22 May 2004
Posts: 4284
Location: The Net

Posted: Sat Mar 12, 2011 11:58 am
Replied to your PM.

No, there is no way to turn that off, because the title is how BA determines whether a post has already been downloaded from a feed.
_________________
Thanks,
Kaktusan
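
To illustrate the behaviour described in this reply: BA's actual code is not shown in the thread, so the following is only a sketch of a generic "seen set" filter keyed on one field. `filter_new` and the sample items are hypothetical. Keying on the title collapses 20 same-titled listings into one and makes a recrawl return nothing new, while keying on the link (shown purely for comparison, not as something BA offers) would keep all 20.

```python
def filter_new(items, seen, key):
    """Return items whose `key` value is not yet in `seen`, recording keys as we go."""
    fresh = []
    for item in items:
        value = item[key]
        if value not in seen:
            seen.add(value)
            fresh.append(item)
    return fresh


# Hypothetical feed: 20 listings that share one title but have distinct links.
items = [
    {"title": "McDonalds", "link": "http://example.com/?p=%d" % n}
    for n in range(1, 21)
]

seen_titles = set()
first_crawl = filter_new(items, seen_titles, "title")
second_crawl = filter_new(items, seen_titles, "title")
by_link = filter_new(items, set(), "link")

print(len(first_crawl), len(second_crawl), len(by_link))  # 1 0 20
```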

unc0nnected



Joined: 17 Dec 2010
Posts: 27

Posted: Sun Mar 13, 2011 1:08 pm
Thank you for your help. It definitely looks to be caused by duplicate posts; it's a shame that uniq was reporting incorrect values. I've changed the title structure of my posts and that seems to have resolved it.

Thanks again
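
A note on the `uniq` miscount: the thread does not say how it was run, but the Unix `uniq` command only collapses identical lines that are adjacent, so on an unsorted title list it over-reports the number of distinct titles. The sketch below, assuming that is what happened, mimics the adjacent-only behaviour in Python; `uniq_adjacent` and the sample titles are hypothetical.

```python
from itertools import groupby


def uniq_adjacent(lines):
    """Collapse runs of identical adjacent lines, like `uniq` without a prior sort."""
    return [key for key, _ in groupby(lines)]


titles = ["McDonalds", "Subway", "McDonalds", "Subway", "Subway"]

print(len(uniq_adjacent(titles)))          # 4: repeats that are not adjacent survive
print(len(uniq_adjacent(sorted(titles))))  # 2: sorting first groups the duplicates
print(len(set(titles)))                    # 2: a set gives the true distinct count
```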

All times are GMT + 2 Hours