Full Version: Joomla Scraper, Grabber For Joomla
Web Design Seo
3D Web Design presents the best aggregator in the world with article spinner and grabber: Joomla Scraper, a scraper for Joomla developed by 3D Web Design.

Joomla Scraper is a high-technology grabber and aggregator that can aggregate many RSS feeds and import them into the Joomla database, together with the FULL TEXT of each original content item (the page that the link in the RSS feed points to).

Quote
Joomla Scraper is the first 100% Google Panda/Penguin safe aggregator in the world!


Differences between Joomla Scraper and Aggregator Platinum:

First of all, Joomla Scraper has FULL K2 integration: it imports images (BMP files are NOT supported!) and tags into K2. Aggregator Platinum has only partial integration with K2.

Joomla Scraper has all the functions known from Joomla Aggregator Platinum plus a couple more:
- has ACL (Joomla 3 and Joomla 2.5 versions only)
- supports spin format
- has full integration with the K2 component
- integrates with Advanced Tags, the best tags component for Joomla: tags are posted directly into the tag component
- supports JomSocial (Joomla 2.5 version only, JomSocial 3 only)
- supports Kunena forums (all Kunena 2 and Kunena 3 versions)
- can work with big synonym databases to make content up to 100% unique
- can grab and insert into the database the full content instead of the short content in RSS feeds
- can strip JavaScript and selected HTML tags from imported content
- has a custom lightweight RSS parser developed by 3D Web Design (Joomla 2.5 version only)
- has a limit function (Joomla 2.5 and 3.0 versions only)
- has a Preview function (Joomla 3 and 2.5 versions only). Now you can preview items before importing them into the Joomla database.
- has a filter by keyword function (Joomla 2.5 and 3.0 versions only) to import only items that contain at least one of the keywords in the list
- has internal duplicate content protection based on the links of imported items (Joomla 2.5 and 3.0 versions only)
- RAM and "time used" debug functions (Joomla 2.5 and 3 versions only)
- new shuffler with an improved algorithm to shuffle content automatically


Quote
By default the parser in "Joomla Scraper" is the latest SimplePie. Our tests show that RSS feeds work better in the aggregator with the old version of SimplePie. Use the new version of the SimplePie parser only if you get a "deprecated" or other error!


The scraper's algorithm is optimized, lightweight and robust.




Latest version History:
14.09.2018: Joomla Scraper v.1.9.9.1 released. No new functions, only updated to work with PHP 7.2. Now you can use all php versions between php 5.5 and php 7.2.


12.01.2016: v.1.9.9 for Joomla 3. Changes:
1. New option: added an option to limit the length of imported content. It works with or without the Scraper. The new option is named "Limit content lenght".


If you enter 25000 in this field, imported content will be limited to 25000 characters. Keep in mind that the resulting clear text can vary if you use the strip HTML tags option. If you enter ZERO in "Limit content lenght", then no matter how much content is inside the imported RSS feed, only the link to the original article will be imported.
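
The behaviour of this option can be pictured with a minimal sketch. This is only an illustration in Python, not the component's code: the function name limit_content and the way the link is rendered are assumptions.

```python
def limit_content(content: str, link: str, limit: int) -> str:
    """Cut imported content to `limit` characters.

    Hypothetical helper: a limit of 0 keeps only a link to the
    original article, as the option description states.
    """
    if limit == 0:
        # Assumed rendering of the "link only" case.
        return f'<a href="{link}">{link}</a>'
    # Plain character cut; HTML tags count towards the limit here.
    return content[:limit]
```

Because the cut is made on raw characters, the visible text is shorter than the limit whenever the content still contains HTML tags.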


2. New function for image SEO: a relevant alt attribute is added to all imported pictures. There is no new option in the feed configuration; the alt attribute is added automatically to images imported into content, to the Intro image and to the Full article image (see screen).



06.10.2014: Joomla Scraper v.1.9.8 for Joomla 3 is released. This is a bug-fix release; only the Joomla 3 version is updated!
27.05.2014: v.1.9.6 for Joomla 3 - FOUR new functions added (only the Joomla 3 version is updated). Now you can use the aggregator on https-only sites with an SSL certificate.
You can also limit the pictures imported into articles by size (px) and by number. Guide: How to import beautiful "clear text only" full articles with only one picture.
21.05.2014: v.1.9.5.1 for Joomla 3 - bug fixes. 1. Fixed a problem with Cyrillic titles containing quotation marks. 2. Fixed a problem with Cyrillic URL alias transliteration in K2.

20.05.2014: v.1.9.5 for Joomla 3 (only the Joomla 3 version is updated). Fixed the "Class JFile Not Found" bug that occurred when the web site is in a subdirectory. New options for images in articles (com_content only): images in content articles can be set automatically by rules. Now the default image or the image downloaded from the RSS feed can be set as Intro Image and/or Full article image in imported content articles.



25 February 2014: Joomla Scraper v.1.9.4 for Joomla 3 is released. Joomla Scraper for Joomla 3.x now supports Kunena 3.
24 February 2014: Joomla Scraper v.1.9.6 for Joomla 2.5 is released. Kunena 3 support is added. Only the version for Joomla 2.5 is updated!
28 January 2014: Fixed bug (sql error while sending Email Report) in both Joomla 3 and Joomla 2.5 Scraper versions.
20 January 2014: Joomla Scraper for Joomla 3 minor version update - improved compatibility with Joomla 3.2.
17 January 2014: Fixed bug in K2 importer (k2Table missing) in both Joomla 3 and Joomla 2.5 version. Added links to guide and service for free cron job service to use with Scraper.
17 December 2013: v.1.9.3 for Joomla 3: Bug fixes with manual import buttons.
09 December 2013: v.1.9.2 for Joomla 3 and v.1.9.5 for Joomla 2.5 released. New Scraper function "remove script tags".
25.06.2013: v.1.9.1 for Joomla 3.0. Minor bug fix in "SimplePie" parser. Is updated only Joomla 3 version!
10.06.2013: v.1.9 for Joomla 3.0 and 1.9.3 for Joomla 2.5: New function added: K2 thumbnails are now supported. If the function is switched on, pictures will be resized automatically and imported into K2. Attention: this costs much more execution time and memory!
23.04.2013: v.1.8.9 for Joomla 3.0, 1.9.2 for Joomla 2.5 and 1.7 for Joomla 1.5: Bug fix in cron job.
21.03.2013: v.1.8.8 for Joomla 3.0 and 1.9.1 for Joomla 2.5: Joomla Scraper now has its own internal "duplicate content protection" function, replacing Joomla's built-in duplicate content protection based on aliases (in com_content). Now you can use synonyms in titles and import every feed as often as you want; no item will be imported twice.

Quote
Joomla has internal "duplicate content protection" based on the titles of content items. But if you use the spinner or synonym replacement in titles, the titles will be unique every time and Joomla's internal "duplicate content protection" will not work. Joomla Scraper now adds internal duplicate content protection based on the links of imported items. This way, with or without spinning in titles, each article is imported only once.


11.03.2013: Simplepie bug fixed (only new version of parser) in combination with feeds with non-utf-8 encoding.
08.03.2013: v.1.8.7 for Joomla 3.0: latest updates (better shuffler, import time and memory usage debug) are now available also for Joomla 3.0 version.
26.02.2013: v.1.8.9 for Joomla 2.5 and v.1.6.8 for joomla 1.5. Added import time and memory usage functions in: manual import, on feed preview page, in email notification and in cron. With this improved statistic you can check and diagnose import problems and measure performance of different feeds between different parsers.
25.02.2013: v.1.8.8 for Joomla 2.5 and v.1.6.7 for joomla 1.5. New, improved shuffler in spinner to make better human readable texts.

01.02.2013: v.1.8.6 for Joomla 3.0 is released. The version for Joomla 3 supports import only into content and K2!

01.02.2013: v.1.8.5. Filter by keyword function added. More info about this function is below.
29.01.2013: released new version 1.8.4. Added two new functions: 1. Preview a feed before import. If you select one or more feeds and click the "preview" button, you will see the resulting content that will be imported. 2. Limit the items imported from a feed. If you enter a number in the "feed limit" field, for example 5, only the first 5 items from this feed will be imported into the Joomla database.
15 January 2013: V.1.8.3: Pagination bug fix in Joomla 2.5 version.
09 January 2013: Joomla Scraper Version 1.8.1 for Joomla 2.5. It allows the latest SimplePie (ver. 1.3.1 - PHP 5.2) to be used in addition to the obsolete SimplePie 1.1.2. In every feed configuration you now have a choice between the old SimplePie parser, the new SimplePie and the custom RSS parser.
08 January 2013: V.1.8.0 for Joomla 2.5. It allows images without extension to be imported as well.
06 January 2013: bug fix in cron in both versions.
19 December 2012: v.1.7.8 for Joomla 2.5 and v.1.6.5 for Joomla 1.5 - Import images in K2 tab "image" bug fix.
08 November 2012: v.1.7.7 for Joomla 2.5 - bug fix release. The fixed bugs are as follows:
1. The RSS parser didn't convert all the result fields in UTF-8;
2. The article tags were not correctly added using the Advanced Tags component.

07 November 2012: v.1.6.4 for Joomla 1.5. Better SEO functions. New configurable option: automated extraction of top key phrases from the content of an article. These phrases are counted and the top key phrases are added as tags to the content item in the Advanced Tags component.
19 September 2012 - v.1.7.6: Added JomSocial and Kunena 2.0 support. Functions to import into JomSocial and into v. 2.xx of Kunena are only for the Joomla 2.5 version of Joomla Scraper!
10 September 2012 - v.1.7.5: The Joomla 2.5 version adds a custom RSS parser developed by 3D Web Design, much faster and less memory consuming than SimplePie. ACL is also added. Updates are only for the Joomla 2.5 version.
16 July 2012 - v.1.7.3 and 1.7.4: Exclude content between html tags in scraper and full K2 integration
25 January to 24 February 2012 - v.1.6.9 to 1.7.2: Spin format and huge synonyms databases support, optimization of code for speed.


Standard functions in Joomla Scraper (also present in Joomla Aggregator Platinum):

Unlimited rss sources
Can post in K2
Can generate 100% unique content
Random html code replacement function
Tags generated from title
Insert tags in integrated tag component
Custom tags at end/start of tags
Random tags
Synonym replacement function
Content combine functions - Add and change html code before/after every content article
Random choice of html code to add
Download images and save them internally, BMP files are NOT supported!
Resize images
Send Email reports
Automated via cron jobs

New since 16.05.2011: Spin content automatically with the integrated article spinner. You can also test the article spinning function for free here: article spinner.


Options added in grabber
1. On/Off of scraper functions;
2. Starting HTML - an HTML portion which indicates the beginning of the full text article on the source's web page, e.g. <div id="article". If left empty, the grabber will take a portion of the RSS feed item and try to use it to determine the actual starting position of the article's full text.

3. Ending HTML - an HTML portion which indicates the end of the full text article, for example you can add:
Code
<div class="afterarticle"

The scraper will then grab the full content from the starting position up to this HTML tag in the page of the original content. This field must not be empty. If left empty, the grabber will exit.

4. Search text length - the number of symbols taken from the beginning of the RSS feed item used to detect the actual starting position of the article's full text. This field must be used only if Starting HTML is not supplied!

5. Strip tags - whether to strip the article's full text from the HTML tags or not.

6. Allowed tags - these tags will not be stripped from the article's full text. You can allow any HTML tag, for example:
Code
<img>, <strong>, <p>, <br>, <br/>
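
How "Strip tags" and "Allowed tags" work together can be shown with a small Python sketch. It is regex-based and only an approximation for illustration; the function name strip_tags and its exact behaviour are assumptions, not the component's implementation.

```python
import re

# Matches an opening or closing HTML tag and captures its name.
TAG_RE = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def strip_tags(html: str, allowed: str = "") -> str:
    """Remove every tag except those named in `allowed`.

    `allowed` uses the same form as the option above,
    e.g. "<img>, <strong>, <p>, <br>, <br/>".
    """
    keep = {name.lower() for name in re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", allowed)}

    def replace(match: re.Match) -> str:
        # Keep the whole tag if its name is allowed, otherwise drop it.
        return match.group(0) if match.group(1).lower() in keep else ""

    return TAG_RE.sub(replace, html)
```

For example, with the allowed list "<p>, <br>" links lose their markup but keep their text. The sketch removes only the tags themselves, so the text inside a script block would remain.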


7. Detect JS redirect - some RSS feed sources tend to hide their full articles from grabbers like ours by supplying Read More URLs in their feeds which redirect the user to the page that contains the article. If this option is enabled, the grabber will try to detect this situation and get the final link to the article instead of the redirection page.

8. Exclude content between html tags. Now you can use two or more html tags - content between these tags will be ignored.
9. Can import in Kunena and Jomsocial.
10. Can preview rss feed before import.
11. Can limit imported from feed items.


Functions of Joomla Scraper
All functions are described in the different settings tabs in the component. Here is some more information about these functions and examples of their use.

Pictures of all functions of Joomla Scraper are below

Custom Rss parser:


To import an RSS feed, you can choose between three parsers: the old version of SimplePie, the new SimplePie (v.1.3.1, released on 30 October 2012) and our "Custom Rss parser". The "new SimplePie" is buggy; it is the old SimplePie with fixes for the "deprecated" errors that are shown on servers with the newest PHP versions.

Quote
Note that the SimplePie parser is not developed by us; it is a free, open source RSS parser. If you have questions about SimplePie or find an issue with this parser, post it here: https://github.com/simplepie/simplepie/issues


The Scraper version has a Custom Rss parser developed by us, much faster and less memory consuming than SimplePie. With our parser you can work on a shared host with less memory and import huge full feeds with 150-200 content items and many pictures inside.

Possibilities with custom Rss parser:
SimplePie can't recognize non-standard feed types; it works only with Atom, RSS 0.9, 1.0 and RSS 2.0. When you try to import a non-standard XML file into the Joomla database with the SimplePie parser, nothing happens.

With our custom feed parser you can insert only selected fields from the RSS feed into content, so you can use non-standard feeds, like XML files with products and others that SimplePie does not parse. But the Custom Rss parser needs configuration for every feed and some HTML knowledge.

Attention! Performance differences between parser versions!
The Scraper version has 3 different parsers: two versions of SimplePie (old and new) and one custom parser. The different SimplePie parsers also perform differently.

By default the parser in "Joomla Scraper" is the latest SimplePie. Our tests show that feeds work better and faster with the old version of SimplePie. Use the "new version" of the SimplePie parser only if you get a "deprecated" or other error! We recommend using the old SimplePie or our Custom Rss parser (configuration needed).

Preview, copy, new and edit functions




General tab and K2 functions



Importing images into K2 works with jpg, gif and png only; BMP and other formats are not supported.

Limit number of imported articles functions:



You can limit number of imported items from every feed.

Filter by keyword.



You can configure one or more keywords in this field. Items from the feed will be imported only if at least one of the keywords is found in the title or in the item content. This option lets you filter results by keyword and build better thematic websites. You can configure one feed many times with different keyword filters and import the results into suitable categories.

Quote
K2, Kunena and JomSocial tabs are shown only if you have these extensions installed in your Joomla!
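
As a rough illustration of this filter, here is a Python sketch. It assumes case-insensitive substring matching and items given as dictionaries with "title" and "content" keys; both are assumptions for the example, since the exact matching rules are not described here.

```python
def filter_items(items, keywords):
    """Keep only feed items whose title or content contains a keyword.

    Hypothetical helper. An empty keyword list is treated as
    "no filter", so every item is kept.
    """
    wanted = [k.strip().lower() for k in keywords if k.strip()]
    if not wanted:
        return list(items)
    kept = []
    for item in items:
        haystack = (item["title"] + " " + item["content"]).lower()
        # One matching keyword is enough to import the item.
        if any(k in haystack for k in wanted):
            kept.append(item)
    return kept
```

Configuring the same feed twice with different keyword lists would then simply mean calling the filter twice and sending each result to its own category.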



Scraper functions in Joomla Scraper after v.1.5.5:

"Permalink - search for" and "Permalink - replace with" fields. These are needed for RSS feeds that use redirects, where the link to the content item in the feed is different from the real content URL. With these fields the scraper knows the right link.

Example Feed: BBC
Code
http://newsrss.bbc.co.uk/rss/sportonline_uk_edition/football/rss.xml


- Improved grabber functions.
- The scraper cleans JavaScript automatically.
- The scraper can look for multiple HTML markup sets in the same RSS feed! Some feed sources tend to use several sets of HTML markup in their articles. Now you can grab all their content: just add a start string - end string pair for each HTML markup set in the "Starting string" and "Ending string" scraper settings. Separate each pair with the '|' character.


Quote
You can configure two or more parameters for the start and end string in the scraper. Example:


start string:
Code
<div class="start">|<div class="other">


end string:
Code
<div class="end">|<div class="otherend">


Real example for a feed from the BBC:
Starting string:
Code
<h1 >|<div class="start">


Ending string:
Code
<div class="bookmark-list">|<div class="end">
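
The pairing of start and end strings can be sketched in Python as follows. This is an illustration under assumptions, not the scraper's code: the name grab_between is made up, the first pair that matches wins, and the start marker is kept in the result while the end marker is not. The fallback for an empty start string (search by RSS item text) is not covered.

```python
def grab_between(page: str, starts: str, ends: str):
    """Return the page fragment for the first start/end pair that matches.

    `starts` and `ends` hold '|'-separated markers; the n-th start
    string belongs to the n-th end string.
    """
    for start, end in zip(starts.split("|"), ends.split("|")):
        if not start or not end:
            continue  # an empty marker cannot be searched for
        begin = page.find(start)
        if begin == -1:
            continue  # this markup set is not used on the page
        stop = page.find(end, begin + len(start))
        if stop == -1:
            continue
        return page[begin:stop]
    return None  # no pair matched
```

With the BBC strings above, a page that uses the h1 markup is cut from the h1 tag up to the bookmark list, and a page that uses the second markup set falls through to the second pair.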


Screen from configuration:



Guide: How To Configure Scraper


Trackback and publish functions:

Simple article - seo content structure (when content is imported from aggregator):


Split introtext option: This function inserts the "Read more" tag at different places in the content item. Options are:
- No intro text ("Read more" tag is inserted at start of content item)
- After "before item" ( "Read more" tag is inserted after content item and before content that is inserted from "before item" field of "Content seo")
- Before "After item" - After content but before "after item" content.
- Only introtext (all text inserted from feed is inside fulltext)



The "Protect from duplicates" function looks at the URL of the full publication in the RSS feed and stores all already imported URLs in the database. The option "automatically delete protection info older then" (in minutes) clears this database table from time to time, because huge tables are slow.
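
The idea of link-based protection with periodic cleanup can be sketched in a few lines of Python. An in-memory dictionary stands in for the database table, and the class and method names are assumptions made for the example.

```python
import time


class SeenLinks:
    """Remember imported article URLs and forget old ones."""

    def __init__(self):
        self._seen = {}  # link -> timestamp of first import

    def is_new(self, link, now=None):
        """Return True the first time a link is offered, False afterwards."""
        now = time.time() if now is None else now
        if link in self._seen:
            return False
        self._seen[link] = now
        return True

    def prune(self, older_than_minutes, now=None):
        """Drop protection info older than the given number of minutes."""
        now = time.time() if now is None else now
        cutoff = now - older_than_minutes * 60
        self._seen = {l: t for l, t in self._seen.items() if t >= cutoff}
```

Note the trade-off this implies: once a link has been pruned, the same article can be imported again if it is still present in the feed, so the cleanup interval should be longer than the time items stay in the feed.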

Attention: the "Split introtext: After N chars" function works well only with clear text!
The "After N chars" function can't recognize whether the content is inside some container (there can be many HTML tags nested in each other), or which container, and it cannot close all HTML tags automatically. So, if you import content with HTML, it is better to use other Split introtext options, not "After N chars".


Content Seo functions:



The fields "before item" and "after item" in the "Content seo" tab support HTML and text in spin syntax. You can add your own HTML code, or random HTML if you use spin format. Both forms below are supported:
Code
{keyword|keyword2|keyword3}
[keyword|keyword2|keyword3]
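
To show what expanding spin syntax means, here is a small Python sketch that handles both bracket styles. It is an illustration only: the function name spin is an assumption, and the component's spinner may treat nesting or plain brackets differently.

```python
import random
import re

# A group in braces or square brackets with no nested group inside.
SPIN_RE = re.compile(r"\{([^{}]*)\}|\[([^\[\]]*)\]")


def spin(text, rng=None):
    """Replace every {a|b|c} or [a|b|c] group with one random option."""
    rng = rng or random.Random()

    def pick(match):
        body = match.group(1) if match.group(1) is not None else match.group(2)
        if "|" not in body:
            return match.group(0)  # not a spin group, leave untouched
        return rng.choice(body.split("|"))

    previous = None
    while previous != text:  # repeat so nested groups resolve inside-out
        previous = text
        text = SPIN_RE.sub(pick, text)
    return text
```

Calling spin("{keyword|keyword2|keyword3}") returns one of the three keywords, so a "before item" field written this way produces different HTML for different imported articles.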


Fields "before title" and "after title" support only clear text and spin syntax.


Synonym replacement functions:



The component can work with huge synonym databases; it works well with databases of 200,000 synonyms and more. Synonyms can be inserted into the Joomla database, and every feed can be configured to work with a different synonym database (example: feed one can work with an English synonym database, feed two with a German one, feed three with a Russian one).

Note that you need to create your own synonym database to use the "synonym substitution" function. A synonym database is not included in Joomla Scraper.

If you want, you can purchase an English synonym database formatted to work with the component (price 25 usd), more info here: http://3dwebdesign.org/forum/english-synon...ice-25-usd-t735

If you want to create it yourself: an example synonym database is published here: http://3dwebdesign.org/forum/index.php?showtopic=1171 - this example is imported into Joomla's database during component installation. You can add more lines to this database and use it.
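
A minimal Python sketch of synonym substitution follows. It assumes the word=synonym; text format that a user posts later in this thread, whole-word and case-insensitive matching, and a single pass over the text. The function names are made up for the example and the component's own rules may differ.

```python
import re


def parse_synonyms(raw):
    """Turn 'new=recent;prevent=avoid;' into a lookup dictionary."""
    pairs = {}
    for entry in raw.split(";"):
        if "=" in entry:
            word, synonym = entry.split("=", 1)
            pairs[word.strip().lower()] = synonym.strip()
    return pairs


def replace_synonyms(text, pairs):
    """Replace each listed word or phrase with its synonym, once."""
    if not pairs:
        return text
    # Longest keys first, so phrases win over the single words inside them.
    keys = sorted(pairs, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: pairs[m.group(1).lower()], text)
```

Doing the replacement in one pass matters for lists that contain both select=choose and chosen=selected: a replaced word is never replaced a second time.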

Rss feed translation over Google translate functions:

Quote
We don't guarantee that the Google Translate functions work! The Google Translate API is now a paid service!




Download images functions:



You can download images locally to your website. Attention: you may need 60 seconds or more in PHP's max_execution_time setting to use this function! Image download works with jpg, gif, png and files without extension.

Meta and tags functions:



Meta tags must be switched ON for the integration with the Advanced Tags component to work. Keywords imported as meta tags are also imported as tags in Advanced Tags.

Cron functions:



Email report functions:



Cache functions:



Ignore list functions:



You can add a list of words to ignore. These words will not be inserted into imported content.

Kunena and Jomsocial integration functions:



When importing into JomSocial, the following settings are available:
- Profile owner;
- Post where: "As status update" or "On wall";
- Allow comments.

When importing into Kunena 2.0, the following settings are available:
- Forum
- Author ID: the user ID of the posts' author
- Thread ID (optional): If set, all the articles will be imported as new posts in this thread. If 0, each article will be posted as a separate thread.


ACL functions:



ACL is available only in the Joomla 3.0 and Joomla! 2.5 versions!

Possibilities for Joomla Scraper

1. With these options activated in our Scraper you can grab full content from any website. You can scrape content from a Facebook page, from Twitter or from any other website.
2. You can grab only the content that you want and remove HTML tags that you don't want (links, for example).
3. You can grab not only the content of the current article from the scraped website; you can grab, for example, the article with its comments.
4. You can scrape a lot of content from many websites, and all of this will run automatically on a schedule for every website.
5. You can change content to be unique, so that content imported into your website is ranked higher in Google (Google will not recognize your content as duplicate).


Quote
Note that 3D Web Design does not encourage stealing content and recommends using Joomla Scraper to grab full content only with the permission of the content's owner.



Requirements:
Recommended PHP requirements for the Scraper version are:
- access to php.ini settings
- max_execution_time = 300 (at least 60 seconds are needed to use the scraper and download images functions)
- memory_limit = 128M or more (it can work with 32 MB and 64 MB, but 128 MB or more is recommended to use all scraper functions and to import multiple feeds with crons)

The extension will work on every shared host with access to php.ini settings and a memory limit of 64 MB or more, but we recommend using a VPS server.

Automatic import with cronjobs - steps:



You must configure cron in two different places:
- in every RSS feed in the Joomla Scraper configuration (different settings for every feed)
- in the cron job configuration in your hosting control panel (set to run every 2, every 3 or every 5 minutes)

1. Create a file named mycron.php. Code for this PHP file:

Code
<?php
// Request the component's cron script over HTTP so that the scheduled import runs.
// Replace yoursite.com with your own domain.
$a = file_get_contents('http://yoursite.com/administrator/components/com_aggregator/cron.aggregator.php');
?>


2. Upload the file mycron.php to your public_html. Create a cron job in your cPanel with the command:
Code
/usr/local/bin/php -q /relative path to public_html/mycron.php >> /dev/null


3. Set the cron to run in your control panel (cPanel or other) every 5 or 3 minutes (recommended). In Unix style this should be:

Code
*/3 * * * *


4. Configure your feeds.
5. Test every feed manually first!
6. Try to import every feed without extras like scraper and image download first! More functions switched on in a feed = more load for your server!
7. When you are sure that the configuration of all feeds is OK, configure the cron part. Before using cron, please read this thread: Cron Job configuration and server load.

Prices:
Joomla Scraper with three upgrades subscription - 49 USD. Download and Online purchase: Joomla Scraper.


Update from previous version: How to update component?

Update from Aggregator Platinum:
Joomla Aggregator Platinum and Joomla Scraper are fully compatible. If you upgrade, just upload the new files over FTP or install the new extension with the Joomla installer. Your configured RSS feeds will stay untouched.

Update from previous version of Joomla Scraper:
Just upload the new version over the old one.


Price of Joomla Scraper is only $49!



Important: How to earn money from Joomla Scraper affiliate?
Updates and more licenses - information and prices.
Web Design Seo
Advanced Tags for Joomla 1.7 is released today. I recommend downloading the update and using the tag component with Joomla Scraper for Joomla 2.5 and J1.5.
Web Design Seo
Today two more functions are added to Joomla Scraper. The options are for integration with K2 and work only when the aggregator posts into K2!

1. Automatic import of images in K2 tab "Images".
When this is switched on and the HTML code contains an img tag, the first image is imported into the "Images" tab in K2.

2. Automatic import of keywords in K2 "Tags".
The aggregator automatically extracts keywords from the title of the current content item. When this function is switched on, the keywords are imported automatically as tags in K2.

Detailed info about integration with K2
Web Design Seo
Latest version of Joomla Scraper for Joomla 1.7 is 1.6.9, released on 5 January 2012.
Latest version of Joomla Scraper for Joomla 1.5 is 1.5.8, released on 5 January 2012.
Web Design Seo
Latest changes in Joomla Scraper:

1. spin format support in "before content" and "after content" fields
2. spin format support in "before title" and "after title" fields
3. The component can now work with big synonym databases; it works well with databases of up to 100,000 synonyms.

Pictures are updated and are included in the first post.

More info about latest updates
Web Design Seo
Joomla Scraper was updated again yesterday. Both versions are updated: for Joomla! 2.5/1.7 and for Joomla! 1.5.

Added are better readmore tag placement and a function to strip selected HTML tags outside the scraper. Result: better stripping of unneeded HTML tags. For example, now you can strip only links, with or without using the scraper.

See here: demos for scraper work.
Nikos
Hello,
does Joomla Scraper import embedded videos (flvs) from rss feeds? Can I test it on a demo?

Thank you
Web Design Seo
Yes, embedded videos and flv.

Nikos, the demo site runs very old versions of the aggregators (more than 8-9 months old), without the latest 5-6 changes. Yes, you can test this function in the demo: import of flash, movies and YouTube videos will work.

Keep in mind only that the latest version works 3-4 times faster with the article spinner and with synonyms: it works with big synonym databases of up to 100,000 words instead of up to 800-1000 words in the demo.
Nikos
Thank you for your reply, we need to import from Blogger to Joomla, which is the right component to buy, I suppose the Joomla Scraper, can you advise please? I need also to have the correct configuration for the embed videos in articles, which component to try on your demo? Can I send you a sample rss feed to import in a test feed to add?

My only concern is the right configuration.

Thank you
Web Design Seo
Plain import can be done with every extension. Platinum and Scraper can also import flash and videos. To choose, please see the comparison of all aggregators.
Nikos
QUOTE (Web Design Seo @ Apr 11 2012, 07:09 AM) *
Plain import can be done with every extension. Platinum and Scraper can also import flash and videos. To choose, please see the comparison of all aggregators.

Hello,

would you help me to configure in our server (if I need help), as we will buy the Joomla Scraper for Joomla 1.5 version


Thank you
Web Design Seo
Yes, of course, we will help if you need it. It is pretty simple: there are only two settings in php.ini that need to be changed in a typical server configuration: max_execution_time and memory_limit.

Just be sure that you have access to the php.ini settings of the server.


P.S. When you buy Joomla Scraper you receive both versions of the component, for Joomla 1.5 and 2.5. When you update your Joomla to 2.5, everything will keep working.
cardin
I am unable to make the synonyms feature work. The words are not replaced. Here's a sample of my synonyms.

new=recent;prevent=avoid;according to=as per;evaluated=assessed;select=choose;selected=chose;chosen=selected;excellent=finest;do=perform;exact=accurate;explain=clarify;important=vital;work out=train;

Please help.

Thanks

Edited: Sorry posted this in under wrong topic should be under joomla

http://3dwebdesign.org/forum/index.php?sho...&#entry3683
Web Design Seo
Did you enter these synonyms in the text field or in the database? If they are in the text field, "As defined" must be selected in the feed configuration.



And what is your version?
cardin
I am using Aggregator Scraper 1.7.0 in Joomla 2.5.

I am unable to make the synonyms feature work. The words are not replaced. Here's a sample of my synonyms.

new=recent;prevent=avoid;according to=as per;evaluated=assessed;select=choose;selected=chose;chosen=selected;excellent=finest;do=perform;exact=accurate;explain=clarify;important=vital;work out=train;

I am using the synonyms in the text field and have selected 'As Defined' synonyms in content.

Please help.

Thanks.
Web Design Seo
Please check that the requirements of the component are met: memory limit, execution time. If yes, open the link "How to receive support" from the signature in my posts and follow it step by step. If there are no errors, send me login data for your site so I can test.
Ivan Stamenov
QUOTE (cardin @ Jun 12 2012, 08:55 AM) *
I am using Aggregator Scraper 1.7.0 in Joomla 2.5.

I am unable to make the synonyms feature work. The words are not replaced. Here's a sample of my synonyms.

new=recent;prevent=avoid;according to=as per;evaluated=assessed;select=choose;selected=chose;chosen=selected;excellent=finest;do=perform;exact=accurate;explain=clarify;important=vital;work out=train;

I am using the synonyms in the text field and have selected 'As Defined' synonyms in content.

Please help.

Thanks.


Hi, cardin. There was a bug in the controller causing this behaviour. The fixed version will be available later today or tomorrow.
Web Design Seo
cardin, I will send you the next version within the next hour. The problem is solved and the new version is tested on two separate installs, one test and one live. If you find another bug in the component, please inform us.

Thank you for your help to make our component better!
Web Design Seo
Joomla Scraper is updated again. Latest version for Joomla 2.5 is 1.7.4, released on 16 July 2012. Latest version for Joomla 1.5 is 1.6.3 (16 July 2012).

Added is an option to exclude content between HTML tags, and bugs in the sentence shuffler when using the intro text/full text option are fixed.
Web Design Seo
From today an example synonym database is installed with every new copy of Joomla Scraper. If you upgrade the component from a previous version, use the SQL query published here (open phpMyAdmin and run the query): example of synonyms database to use in Joomla Scraper

And here you can purchase an English synonym database formatted to work with Joomla Scraper.
Ivan Stamenov
A new custom RSS parser is added to the Joomla 2.5 version of the component. This RSS parser is our work and is much faster and less memory consuming than SimplePie and is suitable for parsing large feeds.




With our custom feed parser you can insert only selected fields from the RSS feed into content, so you can use non-standard feeds, like XML files with products and others that SimplePie does not parse by default.


There are some settings that need to be set for every feed, though. The parser needs to be told where (in which XML tag) to find the relevant content. The following notation must be used:

tag[.required_property_name1:required_property_value1]...[.wanted_property_name]

E.g.:
Code
<link>http://the.link.we.want/</link>
=> set the link tag setting to: link, "get the content of the link tag"

Code
<link rel='alternate'>http://the.link.we.want/</link>
=> set the link tag setting to: link.rel:alternate, "get the content of the link tag with rel property = alternate"

Code
<link rel='alternate' alt='permalink' href='http://the.link.we.want/' />
=> set the link tag setting to: link.rel:alternate.alt:permalink.href, "get the content of the href property of the link tag with rel property = alternate and alt property = permalink"
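
To make the notation more concrete, here is a Python sketch that reads such a setting and applies it to one feed item. It illustrates the notation only and is not the parser's actual code: the function names are assumptions, and property values that contain a dot or XML namespaces are not handled.

```python
import xml.etree.ElementTree as ET


def parse_selector(selector):
    """Split 'tag.name:value.wanted' into tag, required properties, wanted property."""
    tag, *parts = selector.split(".")
    required, wanted = {}, None
    for part in parts:
        if ":" in part:
            name, value = part.split(":", 1)
            required[name] = value  # property that must have this value
        else:
            wanted = part  # property whose value we want to read
    return tag, required, wanted


def extract(item, selector):
    """Return the tag content or the wanted property of the first matching tag."""
    tag, required, wanted = parse_selector(selector)
    for element in item.iter(tag):
        if all(element.get(k) == v for k, v in required.items()):
            if wanted:
                return element.get(wanted)
            return (element.text or "").strip()
    return None
```

With this sketch, the setting link.rel:alternate.alt:permalink.href applied to the third example returns http://the.link.we.want/, while the plain setting link applied to the first example returns the text inside the tag.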


New version is already uploaded in our file directory.
Ivan Stamenov
Version 1.7.6 of the Scraper is available. Now JomSocial and Kunena 2.0 forums are supported.



When importing into JomSocial, the following settings are available:
- Profile owner;
- Post where: "As status update" or "On wall";
- Allow comments.

When importing into Kunena 2.0, the following settings are available:
- Forum
- Author ID: the user ID of the posts' author
- Thread ID (optional): If set, all the articles will be imported as new posts in this thread. If 0, each article will be posted as a separate thread.
Web Design Seo
ACL and import into JomSocial and Kunena are available only in the Joomla 2.5 version. Kunena import is developed to work only with Kunena 2.0 and later. It may also work with older Kunena versions, but this is not tested and we don't guarantee it.

The new version of Joomla Scraper is already available for purchase.
Web Design Seo
A new version for Joomla 1.5 is released today: Joomla Scraper 1.6.4. Only the Joomla 1.5 version is updated!

The new version takes the automated SEO functions to the next level!

Quote
A new function is added through integration with the Advanced Tags component (also updated today). Until now, the component added only one-word keywords extracted from the title as tags.

Now, when articles are added automatically, the tag component counts phrases in the title and body of the article and adds the top key phrases as tags; you configure how many. This gives you more long-tail keywords and key phrases.


The version for Joomla 2.5 will be updated in the same way. That update is expected in the next 3-4 days.
Ivan Stamenov
Version 1.7.7 of Joomla Scraper is now here. It is a bug fix release. The fixed bugs are as follows:

1. The RSS parser didn't convert all the result fields to UTF-8;
2. The article tags were not correctly added using the Advanced Tags component.

SilverOne
QUOTE (Web Design Seo @ May 18 2011, 04:09 PM) *
Pictures from all functions are below


Custom Rss parser:



In the picture above, following the example given, is it the word 'link' that we have to fill in the field, or the actual link (=http://the.link.we.want/)?
Web Design Seo
You must fill in the field name there.
SilverOne
QUOTE (Web Design Seo @ Dec 5 2012, 04:05 PM) *
the field


You have not answered my question, so I repeat it:

In the picture above, following the example given, is it the word 'link' that we have to fill in the field, or the actual link (=http://the.link.we.want/)?
Web Design Seo
You must fill in the field name there.

Open the RSS feed, then open its XML source (press Ctrl+U in the browser) and tell the parser: "Hey, the content of the field 'link' in the feed code must go to the 'link' field in the Joomla Scraper parser options."

Is it OK now? :)
Web Design Seo
We found a bug and the Scraper will be updated in the next few hours: if the "download images" function is switched on, Joomla Scraper doesn't import images into the K2 "image" tab.

A fix for this case is released today and the Scraper is updated to versions:
- 1.7.8 for Joomla 2.5
- 1.6.5 for Joomla 1.5
Web Design Seo
v.1.7.9 for Joomla 2.5 and v.1.6.6 for Joomla 1.5 are released today. These versions contain only bug fixes, for some cases where pictures were not imported into K2.
Web Design Seo
Today Joomla Scraper is updated with this bug fix in the aggregator cron.
Ivan Stamenov
Joomla Scraper Version 1.8.0 for Joomla 2.5 is here. It allows images without extension to be imported as well.
It does not implement the latest SimplePie parser yet.
Ivan Stamenov
Joomla Scraper Version 1.8.1 for Joomla 2.5 is now available.

It allows the latest SimplePie (ver. 1.3.1 - PHP 5.2) to be used in addition to the obsolete 1.1.2 (PHP 4).
One may choose which SimplePie version to use on a per RSS feed basis.
Important note: As the classes in both SimplePie versions have the same names, though, when using the "Import All" button, the first loaded SimplePie version will serve all the remaining feeds as well (regardless of their SimplePie version), because PHP does not allow a class (with the same name) to be re-declared.
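
The limitation in the note above comes from PHP itself: once a class named SimplePie is declared, a second file declaring the same class name cannot be loaded in the same request. A minimal sketch of a guard, assuming a hypothetical helper `loadSimplePieOnce` (this is not the component's code, and the file path passed to it is up to the caller):

```php
<?php
// Hypothetical helper: load a SimplePie file only if no class named
// SimplePie has been declared yet in this request.
// Returns true when the file was loaded, false when a version is already present.
function loadSimplePieOnce($file)
{
    // The second argument (false) disables autoloading: we only ask
    // whether the class is already declared.
    if (class_exists('SimplePie', false)) {
        return false; // the version loaded first keeps serving all feeds
    }
    require_once $file;
    return true;
}
```

Importing feeds one at a time, each in its own request, avoids the problem because every request starts without any SimplePie class loaded.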

Finally xAjax is gone as well.

If we keep the current pace, Scraper ver. 12574545.0.1 will be coming soon... :)
Ivan Stamenov
For those of you who want to update your SimplePie version to 1.3.1, please follow these steps:

1. Download the latest SimplePie (see the attachment);
2. Extract the archive and rename the contained file to simplepie.inc;
3. Upload it to /administrator/components/com_aggregator/inc/simplepie/ and overwrite the existing file.

If you are using PHP 5.2+, you are strongly encouraged to do so, as this will get rid of all the SimplePie "Deprecated: ..." warnings.


There can be differences in performance between parser versions!
The Scraper has 3 different parsers: 2 SimplePie parsers (old and new) and one custom parser. The two SimplePie parsers also perform differently.

The default parser in Joomla Scraper is the latest SimplePie. We have not checked or measured the performance of the old and new SimplePie, so some feeds may work better with the old version.

If you get a "deprecated" or other error, use the new version of the SimplePie parser.
Web Design Seo
A bug fix for pagination in the component's list of all feeds is released today. Many files were changed, so we can't post the fix here.

To receive the update, please send us an email from the address used in your order. The three users who bought the extension yesterday and today have already received the update.
pavelKukov
Today, 2013-01-29, the new version 1.8.4 of Aggregator-Scraper for Joomla is released. It supports some great new features.

1. It is now possible to preview imported items. Screenshot:



If you select one or more feeds and click the "preview" button, you will see the resulting content that will be imported. This function is for testing and fine-tuning the configuration of RSS feeds.

2. You can limit the number of imported articles. Screenshot:



If you enter a number in the "feed limit" field, for example 5, only the first 5 items from this feed will be imported into the Joomla database.
Web Design Seo
Version 1.8.5 is released today. It adds one new function: a keyword filter.



You can now configure a filter with one or more keywords. An item from the feed is imported only if at least one of the keywords is found in its title or body. This option lets you filter results by keyword and build better thematic websites. You can configure the same feed several times with different keyword filters and import the results into the suitable categories.
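
The filter rule described above (import an item if any keyword occurs in the title or body) can be sketched as follows. The function name `itemMatchesKeywords` is hypothetical and this is not the component's code; the sketch assumes a case-insensitive match and that an empty keyword list lets every item through.

```php
<?php
// Hypothetical helper: true when at least one keyword occurs in the title or body.
function itemMatchesKeywords($title, $body, array $keywords)
{
    if (count($keywords) === 0) {
        return true; // assumption: no keywords configured means no filtering
    }
    // Search the visible text only, not the HTML markup of the body.
    $haystack = $title . ' ' . strip_tags($body);
    foreach ($keywords as $keyword) {
        $keyword = trim($keyword);
        // stripos() is case-insensitive for ASCII; for other alphabets
        // mb_stripos() from the mbstring extension would be the safer choice.
        if ($keyword !== '' && stripos($haystack, $keyword) !== false) {
            return true;
        }
    }
    return false;
}

var_dump(itemMatchesKeywords('Joomla 3.0 released', '<p>New features</p>', array('joomla', 'drupal')));
var_dump(itemMatchesKeywords('Weather today', '<p>Sunny</p>', array('joomla', 'drupal')));
```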
Web Design Seo
Joomla Scraper v.1.8.6 for Joomla 3.0 is released.

Screenshots from Joomla3 version:






Quote
The version for Joomla 3.0 supports import only into content and K2!
pavelKukov
Joomla Scraper for Joomla 1.5 and 2.5 now has an improved content shuffling algorithm and new options for better control and automated manipulation. The new algorithm is more precise and the texts it produces are more readable. The new options and settings are:

Shuffle sentences positions - Shuffle the positions of the sentences in the article.
Shuffle sentences - Shuffle the parts of compound sentences, using the given characters as delimiters.
Delimiters for sentences - Delimiters used for splitting text into sentences.
Delimiters for sentence parts - Delimiters used for splitting a sentence into parts.
Punctuation characters - List of characters considered punctuation and not allowed at the start or end of a sentence.
Minimum sentence length - Sentences shorter than n characters will be removed. Use this option to clean up small errors such as sentences that consist only of names. For example, "Mr. Paul Kalkbrenner" will be split into two sentences because of the dot after "Mr".
Fragment shuffling - If fragment shuffling is allowed and a sentence (or part of it) contains more than n words and no delimiters, it is split into phrases that are shuffled randomly.
Shuffling long phrases - Long phrase prevention, used when fragment shuffling is allowed. A text fragment with fewer words than the required number is not affected by the long phrase detection algorithm (in other words, if you want to shuffle short sentences, you must set a number here).
Protected HTML tags - Comma-separated list of HTML tags to protect when removing HTML. This lets you keep some formatting.
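
As an illustration of the first options only (sentence delimiters, minimum sentence length, shuffling sentence positions), here is a rough sketch. It is not the component's algorithm; the function name `shuffleSentences` and its default values are assumptions, and length is counted in bytes, not characters.

```php
<?php
// Hypothetical sketch: split text into sentences, drop very short ones,
// then shuffle the positions of the remaining sentences.
function shuffleSentences($text, $delimiters = '.!?', $minLength = 10)
{
    // Split on whitespace that follows one of the delimiter characters.
    $pattern = '/(?<=[' . preg_quote($delimiters, '/') . '])\s+/';
    $sentences = preg_split($pattern, trim($text), -1, PREG_SPLIT_NO_EMPTY);

    $kept = array();
    foreach ($sentences as $sentence) {
        // "Mr." becomes its own 3-character "sentence" and is removed here.
        if (strlen($sentence) >= $minLength) {
            $kept[] = $sentence;
        }
    }

    shuffle($kept); // random order on every call
    return implode(' ', $kept);
}

echo shuffleSentences('Mr. Paul Kalkbrenner plays tonight. Doors open at nine.'), "\n";
```

The example shows why the minimum length option exists: the dot after "Mr" produces a stray fragment, and the length check removes it before shuffling.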

The latest updated versions are v.1.8.8 for Joomla 2.5 and v.1.6.7 for Joomla 1.5.
pavelKukov
New versions of Joomla Scraper are released today.

The new version (Joomla Scraper 1.6.6 for Joomla 1.5 and Joomla Scraper 1.8.9 for Joomla 2.5) now measures import time and memory usage. With these improved statistics you can make a more accurate assessment and avoid feeds that are too large or too slow.

As you can see from the screenshots, time and memory usage depend on the number of items, the use of the scraper, synonyms replacement and content shuffle, image download, and the response speed of the feed and site.

These cost more memory (you need to increase PHP's memory_limit to 64 MB or more; over 128 MB is recommended):
- large feeds with many items
- usage of scraper, synonyms replacement and content shuffle
- import of many feeds at once

These cost more time (you need to increase PHP's max_execution_time to 120 seconds or more):
- large feeds with many items
- download of images
- slow websites (response speed). A website hosted in your country responds fast; websites on other continents are slower.
- scraper and synonyms replacement


Feed preview without scraper and content shuffle took around a second for a feed with 40 items from Yahoo.

Feed import without scraper and content shuffle took around 6 seconds for a feed with 40 items from Yahoo.

Feed import with scraper, content shuffle and image download took around 30 seconds for a feed with 10 items from iTunes.

Import of 93 items from feeds with different configurations took around 1:30 to 2 minutes. On most hosting accounts the normal PHP configuration is "max_execution_time = 30" to 60 seconds and "memory_limit = 16M" to 32M.

Import from crontab with scraper, content shuffle and image download from a site with a slow response time took around one minute for only 14 items.



Quote
Check your PHP settings and make the needed adjustments in a custom php.ini file, or just open a support ticket with your host and ask them to change these PHP settings!
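
A quick way to see where a server stands against the values recommended above (128 MB of memory, 120 seconds of execution time) is a small check script. This is only a sketch; the helper name `iniSizeToBytes` is made up for the example. It relies on two documented PHP conventions: a memory_limit of -1 and a max_execution_time of 0 both mean "no limit".

```php
<?php
// Convert a php.ini size such as "128M" to bytes. "-1" (no limit) stays negative.
function iniSizeToBytes($value)
{
    $value = trim((string) $value);
    $number = (int) $value;                  // "128M" casts to 128
    switch (strtoupper(substr($value, -1))) {
        case 'G': $number *= 1024;           // fall through
        case 'M': $number *= 1024;           // fall through
        case 'K': $number *= 1024;
    }
    return $number;
}

$memory = iniSizeToBytes(ini_get('memory_limit'));
$time   = (int) ini_get('max_execution_time');

// Thresholds taken from the post: 128 MB and 120 seconds.
$memoryOk = $memory < 0 || $memory >= 128 * 1024 * 1024; // negative = no limit
$timeOk   = $time === 0 || $time >= 120;                 // 0 = no limit

echo 'memory_limit: ', $memoryOk ? 'ok' : 'too low', "\n";
echo 'max_execution_time: ', $timeOk ? 'ok' : 'too low', "\n";
```

Run it through the web server, not only from the command line, because the command-line PHP often uses different limits.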
Web Design Seo
The latest updates (better shuffler, import time and memory usage) are now available for the Joomla 3.0 version as well.
Web Design Seo
Joomla Scraper is updated again. The latest version of Joomla Scraper for Joomla 3.0 is v.1.8.7 (11 March 2013), for Joomla 2.5 it is v.1.9.0 (11 March 2013), and for Joomla 1.5 it is v.1.6.9 (11 March 2013).

The latest releases fix only one bug: when the new SimplePie parser was used with feeds in a non-UTF-8 encoding, some broken symbols were inserted into the content.
pavelKukov
Quote
Joomla has internal "duplicate content protection" based on the titles of content items. But if you use the spinner or synonyms replacement in titles, the titles will be unique every time and Joomla's internal "duplicate content protection" will not work. Joomla Scraper now adds its own duplicate content protection based on the links of imported items. This way, with or without spinning in titles, each article is imported only once.


Today a new version of Joomla Scraper for Joomla 2.5+ and 3.0+ was released, which adds protection against importing duplicate content. You can import one RSS feed as many times as you want, and no matter how often you try to import it, the same post will never be imported into your site twice. This is very helpful when you use the options that make content unique.

The function works as follows: duplicate content protection is enabled by default and is based on the URL of the publication. The link is saved in the database, so from then on the aggregator knows that this link is already imported. When the same RSS feed is re-imported, news items that were already imported are skipped.

To prevent flooding the database with too many records (each imported news link is recorded in a separate row of the table), you have the option to activate automatic deletion after a certain period of time.
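
The mechanism can be pictured with a small in-memory sketch. The class name `ImportedLinks` and its methods are hypothetical, and a plain array stands in for the database table, whose real name and columns are not given in this thread.

```php
<?php
// Hypothetical in-memory stand-in for the table of imported links.
class ImportedLinks
{
    private $seen = array(); // md5(link) => unix timestamp of the import

    // Returns true the first time a link is offered, false on every repeat.
    public function importOnce($link, $now)
    {
        $key = md5(trim($link));
        if (isset($this->seen[$key])) {
            return false; // already imported: skip this item
        }
        $this->seen[$key] = $now;
        return true;
    }

    // Automatic deletion: forget links imported more than $maxAgeSeconds ago.
    public function prune($maxAgeSeconds, $now)
    {
        foreach ($this->seen as $key => $importedAt) {
            if ($now - $importedAt > $maxAgeSeconds) {
                unset($this->seen[$key]);
            }
        }
    }
}

$links = new ImportedLinks();
var_dump($links->importOnce('http://example.com/news/1', 1000)); // first import
var_dump($links->importOnce('http://example.com/news/1', 2000)); // repeat, skipped
```

One consequence of pruning is visible in the sketch: once a link has been deleted from the record, the same item can be imported again if the feed still contains it, so the deletion period should be longer than the time items stay in the feed.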

More extras: the ability to control the maximum execution time directly from the administration. This option is sure to work in Joomla 3+, because Joomla 3+ requires PHP 5.3+. For Joomla 2.5 it depends on the server configuration.

pavelKukov
There was a bug in the cron import function. It was found in Joomla Scraper for both Joomla 2.5 and 3.0. New versions are now available for download.

New versions are as follows:

for Joomla 2.5
com_aggregator_scraper-J25-1.9.2

for Joomla 3.0
com_aggregator_scraper-J30-1.8.9

for Joomla 1.5 (the component works, but the new version has some improvements in the code)
com_aggregator_scraper-J15-1.7

Aggregator Platinum for Joomla 1.5 and 2.5 (the component works, but the new version has some improvements in the code)
pavelKukov
Yesterday's bug fix exposed a new bug today. The error is in "/administrator/components/com_aggregator/helpers/cron.php" around line 65. It occurs when a feed is imported through cron and that feed has already been imported. To fix it manually:

Find in "/administrator/components/com_aggregator/helpers/cron.php" around line 20

Code
function lTrimZeros($number) {
    while ($number[0]=='0') {
        $number = substr($number,1);
    }
    return $number;
}


And Replace It With:

Code
function lTrimZeros($number) {
    $number = (string)$number;
    while (!empty($number) && $number[0]=='0') {
        $number = substr($number,1);
    }
    return $number;
}


Find in "/administrator/components/com_aggregator/helpers/cron.php" around line 65

Code
if(!empty($matches))
{
                if (isset($matches[1]) && $matches[1]=="*") {
                    $matches[2] = 0;        // from
                    $matches[4] = $numberOfElements;        //to
                } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
                    $matches[4] = $matches[2];
                }
                if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
                    $matches[6] = 1;        // step
                }
                $matches[2] = (isset($matches[2]))?$matches[2]:"0";
                $matches[4] = (isset($matches[4]))?$matches[4]:"0";
                $matches[6] = (isset($matches[6]))?$matches[6]:"0";
                                $max_loops = 50;
                                $j = (int)((isset($matches[2]) && !empty($matches[2]))?lTrimZeros($matches[2]):0);
                                $max = (int)((isset($matches[4]) && !empty($matches[4]))?lTrimZeros($matches[4]):0);
                                $incr = (int)((isset($matches[6]) && !empty($matches[6]))?lTrimZeros($matches[6]):0);
                for ($j=$j;$j<=$max && $max_loops;$j+=$incr) {
                    $targetArray[$j] = TRUE;
                                        $max_loops--;
                }
            }


OR

Code
if (isset($matches[1]) && $matches[1]=="*") {
                    $matches[2] = 0;        // from
                    $matches[4] = $numberOfElements;        //to
                } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
                    $matches[4] = $matches[2];
                }
                if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
                    $matches[6] = 1;        // step
                }
    for ($j=lTrimZeros($matches[2]);$j<=lTrimZeros($matches[4]);$j+=lTrimZeros($matches[6])) {
        $targetArray[$j] = TRUE;
    }
}


OR

Code
if(!empty($matches))
            {
                if (isset($matches[1]) && $matches[1]=="*") {
                    $matches[2] = 0;        // from
                    $matches[4] = $numberOfElements;        //to
                } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
                    $matches[4] = $matches[2];
                }
                if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
                    $matches[6] = 1;        // step
                }
                $matches[2] = (isset($matches[2]))?$matches[2]:"0";
                $matches[4] = (isset($matches[4]))?$matches[4]:"0";
                $matches[6] = (isset($matches[6]))?$matches[6]:"0";
    for ($j=lTrimZeros($matches[2]);$j<=lTrimZeros($matches[4]);$j+=lTrimZeros($matches[6])) {
        $targetArray[$j] = TRUE;
    }
}


And Replace It With:

Code
if(!empty($matches))
            {
                if (isset($matches[1]) && $matches[1]=="*") {
                    $matches[2] = 0;        // from
                    $matches[4] = $numberOfElements;        //to
                } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
                    $matches[4] = $matches[2];
                }
                if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
                    $matches[6] = 1;        // step
                }
                $matches[2] = (isset($matches[2]))?$matches[2]:0;
                $matches[4] = (isset($matches[4]))?$matches[4]:0;
                $increment = (isset($matches[6]))?(int)lTrimZeros($matches[6]):1;
                $increment = max($increment,1);
                for ($j=(int)lTrimZeros($matches[2]);$j<=(int)lTrimZeros($matches[4]);$j+=$increment) {
                    $targetArray[$j] = TRUE;
                }
            }
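
A note on the replacement above: it forces the step to be at least 1. With a step of 0 the loop counter never advances, so the loop would never end. The idea can be tried in isolation with a small stand-alone sketch; the function name `expandRange` is hypothetical and is not part of cron.php.

```php
<?php
// Hypothetical stand-alone version of the loop in the replacement code:
// mark every value from $from to $to, in steps of $step, as selected.
function expandRange($from, $to, $step)
{
    $step = max((int) $step, 1); // a step of 0 would never advance $j
    $targetArray = array();
    for ($j = (int) $from; $j <= (int) $to; $j += $step) {
        $targetArray[$j] = true;
    }
    return $targetArray;
}

print_r(array_keys(expandRange('00', '10', '05'))); // keys 0, 5, 10
print_r(array_keys(expandRange('1', '3', '0')));    // step 0 is treated as 1: keys 1, 2, 3
```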


Downloadable versions packed with these bug fixes will be available soon!
cromaplus
QUOTE (Web Design Seo @ Feb 8 2012, 03:30 PM) *
The latest version of Aggregator Platinum works perfectly with Joomla 2.5. It is tested with Joomla 2.5.1.

QUOTE (Web Design Seo @ May 12 2013, 03:46 PM) *
Aggregator Platinum works perfectly with K2. A problem is possible only if the K2 team changed something fundamental in the latest K2 version.

Please post your versions here (Joomla and K2) and on Monday we will check your case.

Hello, thank you for answering. I have Joomla 2.5.11 and K2 v2.6.6. I hope this helps you understand why it does not work. If I had seen the other component, Scraper, earlier, I would have bought that one instead, because it has full support for K2.
cromaplus
Hello, I also cannot really get it to work fully with K2. I do not understand how to import images automatically and have not yet found any help for it.
cromaplus
I bought Joomla Scraper but I cannot import pictures, and I do not understand why. I see the pictures in the preview, but after the import they are not in K2.
Web Design Seo
New versions of Aggregator Scraper for Joomla 2.5 and Joomla! 3.x are now available.

The improvement in this version is that K2 image sizes are now supported. Until now K2 images were just copied under different names; from now on they are resized as follows:

The resize is based on the longer side of the picture. Images smaller than the target size are just renamed. All thumbnail sizes (S, M, L, XL and so on) are made with the sizes configured in the K2 global configuration.
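
The size calculation behind this rule can be sketched as follows. The function name `fitToLongestSide` is hypothetical and the target of 400 pixels in the example is an arbitrary value, not a K2 default; the sketch only computes dimensions and does not touch any image file.

```php
<?php
// Hypothetical helper: compute the new width and height so that the
// longer side equals $target, keeping the aspect ratio.
function fitToLongestSide($width, $height, $target)
{
    $longest = max($width, $height);
    if ($longest <= $target) {
        return array($width, $height); // smaller images keep their size
    }
    $scale = $target / $longest;
    return array(
        max(1, (int) round($width * $scale)),
        max(1, (int) round($height * $scale)),
    );
}

print_r(fitToLongestSide(1600, 1200, 400)); // 400 x 300
print_r(fitToLongestSide(300, 200, 400));   // unchanged: 300 x 200
```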

Only the Joomla 2.5 and Joomla 3.0 versions are updated!

Important Note:
Quote
This new functionality increases resource consumption (RAM and time). The increase depends on the original image size. Six new resized images are generated for each image. It is recommended to increase your max execution time and memory limits. You can disable resizing in the component options.


We recommend using this new function only if you are on a powerful server, such as a VPS.