aj
Joined: 23 Mar 2008 Posts: 36
Posted: Mon Oct 27, 2008 3:08 pm Post subject: Posts being imported twice
Hello
I've noticed posts from certain feeds are imported twice, especially feeds from Ragecash and Deecash:
5576 Stealing Redhead Teen Punished 22 Oct 2008, 16:37
5728 Stealing Redhead Coed Punished 22 Oct 2008, 16:37
Sample feed is:
http://blog.spankedandabused.com/feed/?nats=NDMwMDo0OjE2,0,0,0,
Even though those posts seem to be morphed, it's annoying to see the same pics on the blogs fed by these feeds.
I've already talked to Ben from Ragecash; he doesn't see any problem...
Is this a problem on my side, or is there something wrong with the sponsors' feeds?
How can I avoid this?
Greetings
aj
Atanasis Owner
Joined: 22 May 2004 Posts: 4284 Location: The Net
Posted: Mon Oct 27, 2008 7:26 pm
Well, BA/BO determine the uniqueness of posts by their titles. If the titles themselves morph in a feed, there's no way to prevent BA/BO from importing those posts twice.
It's weird that those sponsors morph the titles as well.
_________________
Thanks,
Kaktusan
aj
Joined: 23 Mar 2008 Posts: 36
Posted: Tue Oct 28, 2008 2:16 am
So I will have to delete all dupes manually *grmpf*
I've discussed this with Ben and I guess he will contact you soon.
topten
Joined: 06 Sep 2007 Posts: 37
Posted: Thu Oct 30, 2008 1:23 am
Aren't posts imported once and once only?
What you are saying, Kaktusan, is that we can't use sponsors who have morphing RSS feeds.
This can't be right, surely, as most sponsors offer morphing RSS feeds, which would make BA useless with them, wouldn't it?
_________________
hi
Atanasis Owner
Joined: 22 May 2004 Posts: 4284 Location: The Net
Posted: Thu Oct 30, 2008 8:33 am
Here's how it works:
1. You import a feed.
2. The script spiders the feed, extracts all posts from it and puts them in the database.
3. You or the script recrawl the feed. To make sure it doesn't put the same posts in the database as last time, the script needs some way to determine whether a post is unique. All RSS-to-blog scripts use the title of the post for this, since all the other fields might change due to scripts or edits.
So BA checks each post title crawled from the feed to see if it matches anything already in the database.
Morphing in feeds normally refers to the body of the post. If the title morphs as well, then the feed should morph per webmaster, not on each feed refresh.
_________________
Thanks,
Kaktusan
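The title check described in the steps above can be sketched roughly like this. It is only an illustration and not BA's actual code: the function name `import_new_posts`, the dict-shaped posts, the in-memory set standing in for the database, and the lowercase/strip normalization are all assumptions made for the example.

```python
def import_new_posts(feed_posts, seen_titles):
    """Return the posts whose title has not been seen before.

    feed_posts:  list of dicts with a "title" key (hypothetical shape).
    seen_titles: set standing in for the titles already in the database.
    """
    imported = []
    for post in feed_posts:
        # Normalizing case and whitespace is an assumption of this sketch.
        title = post["title"].strip().lower()
        if title in seen_titles:
            continue  # title already known: treated as a duplicate
        seen_titles.add(title)
        imported.append(post)
    return imported


seen = set()
import_new_posts([{"title": "Sample Post"}], seen)           # first crawl: imported
import_new_posts([{"title": "Sample Post"}], seen)           # recrawl: skipped
import_new_posts([{"title": "Sample Post Reworded"}], seen)  # morphed title: imported again
```

The last call shows the problem from this thread: once the title changes between refreshes, a title-only check has nothing left to match on.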
aj
Joined: 23 Mar 2008 Posts: 36
Posted: Thu Oct 30, 2008 1:05 pm
Wouldn't it be "safer" to also check standard RSS item elements like <guid> or <pubDate>?
I know these elements don't have to be used in a feed, but if they are present it would be the easiest way to avoid crap like this.
Is there something like a "guideline for sponsor-hosted feeds"? If there were, one could point any sponsor to it and introduce them to hardcore bloggers' needs.
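As a rough sketch of that idea, a dedup key could prefer <guid>, fall back to <pubDate>, and only then use the title. The preference order, the `item_key` name and the sample feed below are assumptions for illustration. Whether <guid> or <pubDate> can be trusted depends entirely on the feed, so this is not a drop-in fix.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def item_key(item):
    """Build a dedup key for one RSS <item> element.

    Preference order (an assumption of this sketch): guid, pubDate, title.
    """
    guid = (item.findtext("guid") or "").strip()
    if guid:
        return ("guid", guid)
    pub_date = (item.findtext("pubDate") or "").strip()
    if pub_date:
        try:
            # RSS dates use the RFC 822 format, which email.utils can parse.
            return ("pubDate", parsedate_to_datetime(pub_date).isoformat())
        except (TypeError, ValueError):
            pass  # unparseable date: fall through to the title
    return ("title", (item.findtext("title") or "").strip().lower())


# Made-up feed: two items share a guid but have different (morphed) titles.
SAMPLE = """<rss version="2.0"><channel>
<item><title>First wording</title><guid>http://example.com/?p=1</guid></item>
<item><title>Second wording</title><guid>http://example.com/?p=1</guid></item>
<item><title>No guid here</title><pubDate>Wed, 22 Oct 2008 16:37:00 +0000</pubDate></item>
</channel></rss>"""

keys = [item_key(item) for item in ET.fromstring(SAMPLE).iter("item")]
```

Here the first two items end up with the same key even though their titles differ, so an importer keyed this way would store only one of them.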
Atanasis Owner
Joined: 22 May 2004 Posts: 4284 Location: The Net
Posted: Thu Oct 30, 2008 8:20 pm
The bad thing is that there are no standards. Everybody builds their feeds the way they want. Then, when their users start complaining that it doesn't work, they start adapting the feeds so the users are happy. I have dealt with a lot of different and odd feeds myself.
As for guid and pubDate, the thing is that they are not safe either. For guid I have seen a lot of crap, and on most feeds it would cause problems if I used it to determine uniqueness.
pubDate could be a factor, but I had some thoughts about it which I don't remember right now. I will try to remember and see if it would be wise to use it.
_________________
Thanks,
Kaktusan
aj
Joined: 23 Mar 2008 Posts: 36
Posted: Thu Oct 30, 2008 11:06 pm
Have I mentioned I still love your work, dude?
Atanasis Owner
Joined: 22 May 2004 Posts: 4284 Location: The Net
Posted: Fri Oct 31, 2008 7:23 am
You just did.
_________________
Thanks,
Kaktusan
topten
Joined: 06 Sep 2007 Posts: 37
Posted: Fri Oct 31, 2008 10:13 am
Thanks, mate.
_________________
hi