> Suggestion - Improvement: Exclude Content Between Tags
NoToy
Posted Jul 7 2012, 11:40 AM
Post #1

Newbie

Group: Members
Posts: 13
Joined: 4-July 12
Member No.: 1,263



Hello,

I have found that more and more sites place advertisements and other unrelated content inside the main body text, which is really frustrating when grabbing content.

Is there any way to exclude this dynamic content? Perhaps a pair of tag delimiters, like the ones used to grab the main content, would work well.
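
To make the request concrete, here is a minimal Python sketch of excluding everything between a start and an end delimiter. It is only an illustration, not the scraper's actual code: the function name exclude_between and the sample "ads" markup are assumptions. The simple non-greedy match stops at the first end delimiter, so it does not handle nested <div> blocks.

```python
import re


def exclude_between(html: str, start: str, end: str) -> str:
    """Remove every span that begins with `start` and ends with the next `end`."""
    # re.escape makes the delimiters match literally instead of as regex syntax.
    # DOTALL lets the excluded span run across line breaks.
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    return pattern.sub("", html)


# Hypothetical page fragment: an ad block sitting inside the main body text.
page = (
    "<p>First paragraph.</p>"
    '<div class="ads">Buy now!</div>'
    "<p>Second paragraph.</p>"
)
cleaned = exclude_between(page, '<div class="ads">', "</div>")
print(cleaned)  # → <p>First paragraph.</p><p>Second paragraph.</p>
```

Text that contains no start delimiter is returned unchanged, so the same delimiter pair can be applied safely to pages without the unwanted block.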

It would also be great to have a blacklist of words to remove from the main content, which would be useful for stripping unwanted static text.

I hope this can be implemented in the next upgrade.

Best regards!


This post has been edited by NoToy: Jul 7 2012, 12:14 PM
Web Design Seo
Posted Jul 7 2012, 01:03 PM
Post #2

Group: Root Admin
Posts: 4,165
Joined: 29-April 09
From: Sofia
Member No.: 1



JavaScript such as AdSense and similar scripts is cleaned automatically from the content. This feature was added about a year ago - info.


--------------------
Forum Rules | How to receive support. 3D Web Design: web design, SEO optimization, Web Site Extensions, Oscommerce Addons, Wordpress plugins and Joomla Extensions. Website development, search engine optimization and SEO services.
NoToy
Posted Jul 7 2012, 05:04 PM
Post #3

QUOTE (Web Design Seo @ Jul 7 2012, 03:03 PM)
JavaScript such as AdSense and similar scripts is cleaned automatically from the content. This feature was added about a year ago - info.


Yes, and that is really important, but I'm referring to non-JavaScript dynamic content that is generated on the server side and shown inside a <div>. For example: a self-made Twitter timeline, image galleries with text, self-made text advertisements, etc. I have found that all of this content is grabbed too, and it makes no sense in the main text.

And as I posted, it would also be great to have a blacklist of words to remove from the main content, which would be useful for stripping unwanted static text. A lot of meaningless words are imported, so this would be very useful.
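
As a rough illustration of the blacklist idea, a minimal Python sketch follows. It is not a feature of the scraper: the function name strip_blacklisted and the sample phrases are assumptions. It also assumes that every blacklist entry begins and ends with a letter or digit, because the word-boundary check relies on that.

```python
import re
from typing import Iterable


def strip_blacklisted(text: str, blacklist: Iterable[str]) -> str:
    """Remove each blacklisted word or phrase (case-insensitive), then tidy spaces."""
    for phrase in blacklist:
        # \b stops "ad" from matching inside "read"; re.escape keeps the phrase literal.
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", "", text, flags=re.IGNORECASE)
    # Collapse the runs of spaces left behind by the removals.
    return re.sub(r"[ \t]{2,}", " ", text).strip()


# Hypothetical leftover phrases that were grabbed together with the article text.
grabbed = "Read more Great article about Joomla. Share this"
print(strip_blacklisted(grabbed, ["Read more", "Share this"]))
# → Great article about Joomla.
```

Matching whole words only is a deliberate choice here: a plain substring replace would also damage ordinary words that happen to contain a blacklisted entry.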

Best regards!
Web Design Seo
Posted Jul 9 2012, 10:07 AM
Post #4

Just wait for the next version. An Exclude Content Between Tags function will be added. Not right away, maybe in 1-2 weeks.


NoToy
Posted Jul 9 2012, 12:13 PM
Post #5

QUOTE (Web Design Seo @ Jul 9 2012, 12:07 PM)
Just wait for the next version. An Exclude Content Between Tags function will be added. Not right away, maybe in 1-2 weeks.


Really great news! Can't wait to see it working!

Thank you for your great support.

I hope you can also add a blacklist for words in the content...

This post has been edited by NoToy: Jul 9 2012, 12:36 PM
Web Design Seo
Posted Jul 9 2012, 01:49 PM
Post #6

The "Ignore List" option was added a long time ago - just see the last tab in your scraper.


P.S. Done: the exclude between tags function now works in the latest version.



NoToy
Posted Jul 9 2012, 02:18 PM
Post #7

QUOTE (Web Design Seo @ Jul 9 2012, 03:49 PM)
The "Ignore List" option was added a long time ago - just see the last tab in your scraper.


I have seen it, but this ignore list is only for tags, isn't it?