> Joomla Scraper, Grabber For Joomla, Joomla Scraper Can Grab Any Content From Any Website
pavelKukov
post Feb 26 2013, 11:23 AM
Post #41


Php programmer
****

Group: Administrators
Posts: 285
Joined: 26-November 12
From: Bulgaria
Member No.: 1,452



New versions of Joomla Scraper were released today.

The new versions (Joomla Scraper 1.6.6 for Joomla 1.5 and Joomla Scraper 1.8.9 for Joomla 2.5) now report import time and memory usage. With these improved statistics you can make a more accurate assessment and avoid feeds that are too large or too slow.

As you can see from the screenshots, time and memory usage depend on the number of items, use of the scraper, synonym replacement and content shuffle, image download, and the response speed of the feed and the site.
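
For readers who want to take similar measurements in their own scripts, here is a minimal sketch. It does not show how Joomla Scraper collects its statistics internally. The helper name measureImport() and the dummy workload are hypothetical; microtime() and memory_get_peak_usage() are standard PHP functions.

```php
<?php
// Hypothetical helper: runs a callable and reports elapsed time and peak memory.
// Illustration only, not the component's actual code.
function measureImport(callable $import)
{
    $start = microtime(true);
    $result = $import();
    return array(
        'result'  => $result,
        'seconds' => microtime(true) - $start,
        // Peak memory allocated by the script so far, in megabytes.
        'peak_mb' => memory_get_peak_usage(true) / 1048576,
    );
}

// Dummy workload standing in for a feed import of 40 items.
$stats = measureImport(function () {
    $items = array();
    for ($i = 0; $i < 40; $i++) {
        $items[] = str_repeat('x', 1024);
    }
    return count($items);
});

printf("Imported %d items in %.3f s, peak memory %.1f MB\n",
    $stats['result'], $stats['seconds'], $stats['peak_mb']);
```

The numbers will differ on every server, which is why comparing them against your own memory_limit and max_execution_time is more useful than comparing them against someone else's results.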

These cost more memory (you need to increase PHP's memory_limit to 64 MB or more; over 128 MB is recommended):
- large feeds with many items
- use of the scraper, synonym replacement and content shuffle
- importing many feeds at once

These cost more time (you need to increase PHP's max_execution_time to 120 seconds or more):
- large feeds with many items
- downloading images
- slow websites (response speed). A website hosted in the same country as your server responds quickly; websites on other continents are slower.
- the scraper and synonym replacement


A feed preview without the scraper and content shuffle took around one second for a feed with 40 items from Yahoo.



A feed import without the scraper and content shuffle took around 6 seconds for a feed with 40 items from Yahoo.



A feed import with the scraper, content shuffle and image download took around 30 seconds for a feed with 10 items from iTunes.



Importing 93 items from feeds with different configurations took around 1:30 to 2 minutes. On most hosting accounts the normal PHP configuration is "max_execution_time = 30" to 60 seconds and "memory_limit = 16M" to 32M.



An import from crontab with the scraper, content shuffle and image download from a site with a slow response time took around one minute for only 14 items.



Quote
Check your PHP settings and make the needed adjustments in a custom php.ini file, or just open a support ticket with your host and ask them to change these PHP settings!
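
A quick way to check the two settings from a script is sketched below. The thresholds (64M and 120 seconds) are the ones mentioned in this post; the function names iniSizeToBytes() and checkImportLimits() are hypothetical helpers, not part of Joomla Scraper. ini_get() is the standard PHP function for reading the active configuration.

```php
<?php
// Hypothetical helper: convert PHP shorthand such as "128M" or "1G" to bytes.
function iniSizeToBytes($value)
{
    $value = trim($value);
    if ($value === '') {
        return 0;
    }
    $number = (int)$value; // (int)"128M" gives 128
    switch (strtoupper(substr($value, -1))) {
        case 'G': return $number * 1024 * 1024 * 1024;
        case 'M': return $number * 1024 * 1024;
        case 'K': return $number * 1024;
    }
    return $number;
}

// Hypothetical helper: compare settings with the minimums named in the post above.
function checkImportLimits($memoryLimit, $maxExecutionTime)
{
    $warnings = array();
    $bytes = iniSizeToBytes($memoryLimit);
    // -1 means "no limit" for memory_limit.
    if ($bytes !== -1 && $bytes < 64 * 1024 * 1024) {
        $warnings[] = "memory_limit is $memoryLimit; 64M or more is needed, over 128M is recommended";
    }
    // 0 means "no limit" for max_execution_time.
    $seconds = (int)$maxExecutionTime;
    if ($seconds !== 0 && $seconds < 120) {
        $warnings[] = "max_execution_time is $seconds; 120 or more is needed";
    }
    return $warnings;
}

// Check the values PHP is actually running with.
foreach (checkImportLimits(ini_get('memory_limit'), ini_get('max_execution_time')) as $warning) {
    echo $warning, "\n";
}
```

In a custom php.ini the corresponding lines would be memory_limit = 128M and max_execution_time = 120. Whether a custom php.ini is honored depends on the host.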



--------------------
Php programmer in 3D Web Design
Web Design Seo
post Mar 8 2013, 08:51 AM
Post #42


Web Design Seo
****

Group: Root Admin
Posts: 4,156
Joined: 29-April 09
From: Sofia
Member No.: 1



The latest updates (better shuffler, import time and memory usage reporting) are now also available for the Joomla 3.0 version.


--------------------
Forum Rules | How to receive support. 3D Web Design: web design, SEO optimization, Web Site Extensions, Oscommerce Addons, Wordpress plugins and Joomla Extensions. Website development, search engine optimization and SEO services.
Web Design Seo
post Mar 11 2013, 09:34 AM
Post #43





Joomla Scraper has been updated again. The latest version of Joomla Scraper for Joomla 3.0 is v.1.8.7, for Joomla 2.5 it is 1.9.0, and for Joomla 1.5 it is 1.6.9 (all released 11 March 2013).

The latest releases fix a single bug: when the new SimplePie parser was used with feeds in a non-UTF-8 encoding, some broken symbols were inserted into the content.


pavelKukov
post Mar 19 2013, 01:31 PM
Post #44





Quote
Joomla has internal "duplicate content protection" based on the titles of content items. But if you use the spinner or synonym replacement in titles, the titles will be unique every time and Joomla's internal "duplicate content protection" will not work. Joomla Scraper now has its own duplicate content protection based on the links of imported items. This way, with or without spinning in titles, each article will be imported only once.


A new version of Joomla Scraper for Joomla 2.5+ and 3.0+ was released today. It adds protection against importing duplicate content. You can import one RSS feed as many times as you want, and no matter how often you import it, the same post will never be imported into your site twice. This is very helpful when you use the options that make content unique.

The function works as follows: duplicate content protection is enabled by default and is based on the URL of the publication. The link is saved in the database, so from then on the aggregator knows that this link has already been imported. When the same RSS feed is imported again, news already imported is skipped.

To prevent flooding the database with too many records (each imported news link is recorded as a separate row in the table), you have the option to activate automatic deletion after a certain period of time.
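
The idea can be illustrated with a small sketch. It uses an in-memory array in place of the database table, so it shows only the logic, not the component's real storage or code. The class name ImportedLinks, its methods and the example.com links are all hypothetical.

```php
<?php
// Hypothetical in-memory stand-in for the table of imported links.
class ImportedLinks
{
    private $seen = array(); // link => Unix timestamp of the import

    // Returns true and records the link if it is new; false if it was imported before.
    public function markIfNew($link, $now)
    {
        if (isset($this->seen[$link])) {
            return false;
        }
        $this->seen[$link] = $now;
        return true;
    }

    // Automatic deletion: forget links imported more than $maxAgeDays ago.
    public function purgeOlderThan($maxAgeDays, $now)
    {
        $cutoff = $now - $maxAgeDays * 86400;
        foreach ($this->seen as $link => $importedAt) {
            if ($importedAt < $cutoff) {
                unset($this->seen[$link]);
            }
        }
    }
}

$links = new ImportedLinks();
$feed = array(
    array('title' => 'Spun title A', 'link' => 'http://example.com/news/1'),
    array('title' => 'Spun title B', 'link' => 'http://example.com/news/1'), // same article, different title
    array('title' => 'Other news',   'link' => 'http://example.com/news/2'),
);

$imported = array();
$now = time();
foreach ($feed as $item) {
    // The link, not the title, decides whether the item is a duplicate.
    if ($links->markIfNew($item['link'], $now)) {
        $imported[] = $item['title'];
    }
}
print_r($imported);
```

Note the trade-off of automatic deletion: once a link has been purged, the same article can be imported again if it is still present in the feed.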

More extras: the ability to control the maximum execution time directly from the administration. This option is sure to work in Joomla 3+ because Joomla 3+ requires PHP 5.3+. For Joomla 2.5 this option depends on the server configuration.

Attached File  com_aggregator_scraper_J25_1.9.1_time_limit.png ( 50.12K ) Number of downloads: 10


Attached File  com_aggregator_scraper_J30_1.8.8_time_limit.png ( 66.55K ) Number of downloads: 11


Attached File  com_aggregator_scraper_J25_1.9.1_dublicate_protectiont.png ( 223.49K ) Number of downloads: 9


Attached File  com_aggregator_scraper_J30_1.8.8_dublicate_protection.png ( 152.01K ) Number of downloads: 6


pavelKukov
post Apr 23 2013, 12:06 PM
Post #45





There was a bug in the cron import function. It was found in Joomla Scraper for both Joomla 2.5 and 3.0. New versions are now available for download.

New versions are as follows:

for Joomla 2.5
com_aggregator_scraper-J25-1.9.2

for Joomla 3.0
com_aggregator_scraper-J30-1.8.9

for Joomla 1.5 (the component works, but the new version has some code improvements)
com_aggregator_scraper-J15-1.7

Aggregator Platinum for Joomla 1.5 and 2.5 (the component works, but the new version has some code improvements)


pavelKukov
post Apr 24 2013, 09:33 AM
Post #46





Yesterday's bug fix exposed a new bug today. The error is in "/administrator/components/com_aggregator/helpers/cron.php" around line 65. It occurs when an emission is imported through cron and that emission has already been imported. To fix it manually, do the following:

Find in "/administrator/components/com_aggregator/helpers/cron.php" around line 20:

Code
function lTrimZeros($number) {
    while ($number[0]=='0') {
        $number = substr($number,1);
    }
    return $number;
}


And Replace It With:

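Code
function lTrimZeros($number) {
    // Cast first so that integer input can be indexed like a string.
    $number = (string)$number;
    // empty() is true for "" and "0", so the loop stops before the string runs out.
    while (!empty($number) && $number[0]=='0') {
        $number = substr($number,1);
    }
    return $number;
}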


Find one of the following three variants in "/administrator/components/com_aggregator/helpers/cron.php" around line 65:

Code
if(!empty($matches))
{
                if (isset($matches[1]) && $matches[1]=="*") {
                    $matches[2] = 0;        // from
                    $matches[4] = $numberOfElements;        //to
                } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
                    $matches[4] = $matches[2];
                }
                if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
                    $matches[6] = 1;        // step
                }
                $matches[2] = (isset($matches[2]))?$matches[2]:"0";
                $matches[4] = (isset($matches[4]))?$matches[4]:"0";
                $matches[6] = (isset($matches[6]))?$matches[6]:"0";
                                $max_loops = 50;
                                $j = (int)((isset($matches[2]) && !empty($matches[2]))?lTrimZeros($matches[2]):0);
                                $max = (int)((isset($matches[4]) && !empty($matches[4]))?lTrimZeros($matches[4]):0);
                                $incr = (int)((isset($matches[6]) && !empty($matches[6]))?lTrimZeros($matches[6]):0);
                for ($j=$j;$j<=$max && $max_loops;$j+=$incr) {
                    $targetArray[$j] = TRUE;
                                        $max_loops--;
                }
            }


OR

Code
if (isset($matches[1]) && $matches[1]=="*") {
                    $matches[2] = 0;        // from
                    $matches[4] = $numberOfElements;        //to
                } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
                    $matches[4] = $matches[2];
                }
                if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
                    $matches[6] = 1;        // step
                }
    for ($j=lTrimZeros($matches[2]);$j<=lTrimZeros($matches[4]);$j+=lTrimZeros($matches[6])) {
        $targetArray[$j] = TRUE;
    }
}


OR

Code
if(!empty($matches))
            {
                if (isset($matches[1]) && $matches[1]=="*") {
                    $matches[2] = 0;        // from
                    $matches[4] = $numberOfElements;        //to
                } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
                    $matches[4] = $matches[2];
                }
                if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
                    $matches[6] = 1;        // step
                }
                $matches[2] = (isset($matches[2]))?$matches[2]:"0";
                $matches[4] = (isset($matches[4]))?$matches[4]:"0";
                $matches[6] = (isset($matches[6]))?$matches[6]:"0";
    for ($j=lTrimZeros($matches[2]);$j<=lTrimZeros($matches[4]);$j+=lTrimZeros($matches[6])) {
        $targetArray[$j] = TRUE;
    }
}


And Replace It With:

Code
if(!empty($matches))
{
    if (isset($matches[1]) && $matches[1]=="*") {
        $matches[2] = 0;        // from
        $matches[4] = $numberOfElements;        // to
    } elseif (isset($matches[4]) && isset($matches[2]) && $matches[4]=="") {
        $matches[4] = $matches[2];
    }
    if (isset($matches[5]) && isset($matches[5][0]) && $matches[5][0]!="/") {
        $matches[6] = 1;        // step
    }
    $matches[2] = (isset($matches[2]))?$matches[2]:0;
    $matches[4] = (isset($matches[4]))?$matches[4]:0;
    // Default to a step of 1 and never allow 0, which would keep the loop from advancing.
    $increment = (isset($matches[6]))?(int)lTrimZeros($matches[6]):1;
    $increment = max($increment,1);
    for ($j=(int)lTrimZeros($matches[2]);$j<=(int)lTrimZeros($matches[4]);$j+=$increment) {
        $targetArray[$j] = TRUE;
    }
}
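
To see what the fixed loop does, here is a small self-contained sketch. lTrimZeros() is the corrected function from this post; expandRange() is a hypothetical wrapper written only for this illustration and does not exist in cron.php.

```php
<?php
// Corrected function from this post.
function lTrimZeros($number) {
    $number = (string)$number;
    while (!empty($number) && $number[0]=='0') {
        $number = substr($number,1);
    }
    return $number;
}

// Hypothetical wrapper around the fixed loop: expands from/to/step into a list of values.
function expandRange($from, $to, $step) {
    $targetArray = array();
    // A step of 0 would never advance $j, so it is clamped to at least 1.
    $increment = max((int)lTrimZeros($step), 1);
    for ($j = (int)lTrimZeros($from); $j <= (int)lTrimZeros($to); $j += $increment) {
        $targetArray[$j] = TRUE;
    }
    return array_keys($targetArray);
}

print_r(expandRange('00', '10', '05')); // values 0, 5, 10
print_r(expandRange('3', '5', '0'));    // step clamped to 1: values 3, 4, 5
```

The second call shows the case the clamp protects against: without max($increment,1), a step of 0 would leave $j at 3 on every pass.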


Downloadable versions that include these bug fixes will be available soon!


cromaplus
post May 12 2013, 05:01 PM
Post #47


Newbie
*

Group: Members
Posts: 5
Joined: 12-May 13
Member No.: 1,684



QUOTE (Web Design Seo @ Feb 8 2012, 03:30 PM) *
The latest version of Aggregator Platinum works perfectly with Joomla 2.5. It is tested with Joomla 2.5.1.

QUOTE (Web Design Seo @ May 12 2013, 03:46 PM) *
Aggregator Platinum works perfectly with K2. A problem is possible only if the K2 team changed something fundamental in the latest K2 version.

Please post your versions here (Joomla and K2), and on Monday we will check your case.

Hello, thank you for answering. I have Joomla 2.5.11 and K2 v2.6.6. I hope you can work out why it does not work. If I had seen the other component (Scraper) earlier, I would have bought that one instead, because I bought this one believing that it has full support for K2.
cromaplus
post Jun 10 2013, 01:48 PM
Post #48





Hello, I also can't really get it to work fully with K2. I don't understand how to import images automatically, and I have not yet found any help for it.


This post has been edited by cromaplus: Jun 10 2013, 01:49 PM
cromaplus
post Jun 10 2013, 03:21 PM
Post #49





I bought Joomla Scraper but I cannot import pictures, and I don't understand why. I see the pictures in the preview, but after the import they are not there in K2.
Web Design Seo
post Jun 11 2013, 06:22 AM
Post #50





New versions of Aggregator Scraper for Joomla 2.5 and Joomla! 3.x are now available.

The improvement in this version is that K2 image sizes are now supported. Until now K2 images were just copied under different names; from now on they will be resized as follows:

Resizing is based on the bigger side of the picture. Images smaller than the target size will just be renamed. All thumbnail sizes (S, M, L, XL and so on) are generated using the sizes set in the K2 global configuration.
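
The arithmetic behind "resize by the bigger side" can be sketched as follows. This is an assumption about the calculation, not the component's code: resizeByBiggerSide() is a hypothetical helper, and the target sizes are illustrative only, since the real values come from your K2 global configuration.

```php
<?php
// Hypothetical helper: scale so that the bigger side equals $target, keeping the aspect ratio.
// Returns null when the image does not exceed the target (it would just be renamed).
function resizeByBiggerSide($width, $height, $target)
{
    $bigger = max($width, $height);
    if ($bigger <= $target) {
        return null;
    }
    $scale = $target / $bigger;
    return array(
        'width'  => max(1, (int)round($width * $scale)),
        'height' => max(1, (int)round($height * $scale)),
    );
}

// Illustrative target sizes only; the real ones are set in the K2 global config.
$sizes = array('S' => 200, 'M' => 400, 'L' => 600);
foreach ($sizes as $name => $target) {
    $new = resizeByBiggerSide(1600, 1200, $target);
    echo $name, ': ', $new['width'], 'x', $new['height'], "\n";
}
```

Each extra thumbnail means decoding and rescaling the original image again, so large source images cost noticeably more time and memory.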

Only the Joomla 2.5 and Joomla 3.0 versions are updated!

Important Note:
Quote
This new functionality will increase resource consumption (RAM and time). The increase depends on the original image size. Six new resized images are generated for each image. It is recommended to increase your max execution time and memory limits. You can disable resizing in the component options.


We recommend using this new function only if you are on a powerful server (VPS or similar).


pavelKukov
post Jun 25 2013, 09:54 AM
Post #51





A new downloadable version of "aggregator_scraper" for Joomla 3.x will be available soon.

The version number for Joomla 3 is now 1.9.1. Only the Joomla 3 version is updated!

We found a minor bug in previous versions of "aggregator_scraper" for Joomla 3.x. The bug affects only some feeds, with both the old and the new SimplePie parser. Affected feeds were skipped because their content looked blank, although it was not.

This bug is fixed in version 1.9.1.

We recommend upgrading your version of "aggregator_scraper" for Joomla 3.x. The latest version is tested and working with both of the latest Joomla 3 versions: Joomla 3.0.3 and Joomla 3.1.1.


pavelKukov
post Dec 9 2013, 02:53 PM
Post #52





Update for Joomla 2.5 and 3.x users!

The new versions are:
For Joomla 2.5 - 1.9.5
For Joomla 3.x - 1.9.2

What's new?

A new option named "Remove Script Tags". With this option you can decide whether to remove script tags from the scraped HTML or keep them. Keeping them is useful when you are trying to import content which relies on JavaScript.

Fixed a minor error with image URLs. In previous versions of aggregator_scraper, in rare cases where one image was repeated multiple times in the code, the image URL became invalid.
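
As a rough illustration of what a "Remove Script Tags" switch does to scraped HTML, consider the sketch below. It is not the component's implementation: filterScripts() is a hypothetical function, and a regular expression is used here only because it keeps the example short.

```php
<?php
// Hypothetical stand-in for the "Remove Script Tags" option.
function filterScripts($html, $removeScriptTags)
{
    if (!$removeScriptTags) {
        return $html; // keep scripts, e.g. for embedded players
    }
    // Drop <script ...>...</script> blocks, case-insensitively and across line breaks.
    return preg_replace('#<script\b[^>]*>.*?</script\s*>#is', '', $html);
}

$html = '<p>Video</p><script src="player.js"></script><div id="player"></div>';
echo filterScripts($html, true), "\n";  // script removed
echo filterScripts($html, false), "\n"; // script kept
```

With the option off, the imported content keeps its scripts, so only import from sources you trust in that mode.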

Screenshot of new option:

Attached File  joomla_3_remove_script.png ( 26.1K ) Number of downloads: 6


This update is highly recommended for users that are trying to import content which relies on javascript!


Web Design Seo
post Jan 17 2014, 09:52 AM
Post #53





Joomla Scraper was updated again today. A bug is fixed in both the Joomla 3 and Joomla 2.5 versions; it affected the K2 importer with the latest K2 version (k2Table missing).


Web Design Seo
post Jan 20 2014, 12:47 PM
Post #54





Joomla Scraper for Joomla 3 was updated again today, with improved compatibility with Joomla 3.2.


pavelKukov
post Jan 28 2014, 09:56 AM
Post #55





A bug (SQL error while sending the Email Report) was fixed today in both Scraper versions.
The update is available for both the Joomla 2.5.x and Joomla 3.x versions.


Web Design Seo
post Feb 24 2014, 01:44 PM
Post #56





A new version for Joomla 2.5 is released: Joomla Scraper v.1.9.6. It adds support for the newest Kunena versions, so Joomla Scraper for Joomla 2.5 now supports Kunena 3.

Only the version for Joomla 2.5 is updated!


pavelKukov
post Feb 25 2014, 10:20 AM
Post #57





A new version for Joomla 3.x is released: Joomla Scraper v.1.9.4.
It adds support for the newest Kunena versions. Joomla Scraper for Joomla 3.x now supports Kunena 3.

Attached File  aggregator_194_kunena.png ( 19.57K ) Number of downloads: 3


ataman79
post Mar 20 2014, 10:13 AM
Post #58


Newbie
*

Group: Members
Posts: 32
Joined: 26-November 10
Member No.: 399



Is it possible to grab only a certain number of characters from a news item?

For example, I set my scraper to grab the full image and the whole text of the news item. But I don't want the full text, only a certain number of characters from it.
For example, the starting string is <div class="round"> and the ending string (including the text) is <div class="extra">. The only tags inside the text area are <p> and </p>.

So is this possible?

Thanks in advance
Web Design Seo
post Mar 20 2014, 10:49 AM
Post #59





This is possible only by combining two functions: Strip HTML and Introtext Length. But the content will then be without HTML, as plain text.
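
For illustration only, here is a rough PHP sketch of what this combination amounts to: cut the text between a start and an end string, strip the HTML, then limit the length. extractIntro() and the sample markup are hypothetical and are not the component's code; the start and end strings are the ones from the question above.

```php
<?php
// Hypothetical helper: cut between $start and $end, strip HTML, keep $maxChars characters.
function extractIntro($html, $start, $end, $maxChars)
{
    $from = strpos($html, $start);
    if ($from === false) {
        return ''; // start string not found
    }
    $from += strlen($start);
    $to = strpos($html, $end, $from);
    $fragment = ($to === false) ? substr($html, $from) : substr($html, $from, $to - $from);
    // Put a space before each tag so that adjacent paragraphs do not run together,
    // then remove the tags and collapse the whitespace.
    $text = strip_tags(str_replace('<', ' <', $fragment));
    $text = trim(preg_replace('/\s+/', ' ', $text));
    // mb_substr counts characters rather than bytes when mbstring is available.
    return function_exists('mb_substr')
        ? mb_substr($text, 0, $maxChars, 'UTF-8')
        : substr($text, 0, $maxChars);
}

$page = '<div class="round"><p>First paragraph of the news.</p><p>Second paragraph.</p></div>'
      . '<div class="extra">ads</div>';
echo extractIntro($page, '<div class="round">', '<div class="extra">', 28), "\n";
```

The result is plain text because the tags are removed before the length limit is applied; cutting HTML at a fixed character count could otherwise leave tags unclosed.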



ataman79
post Mar 31 2014, 01:10 PM
Post #60





QUOTE (Web Design Seo @ Mar 20 2014, 10:49 AM) *
This is possible only by combining two functions: Strip HTML and Introtext Length. But the content will then be without HTML, as plain text.


Hi again,
OK, in the Scraper tab I set the starting string and the ending string.

After that, following your answer and the attached picture, I set these options in the Publish tab:
Strip content HTML tags - Yes
Strip title HTML tags - Yes
Strip special chars in title - Yes

Should I clear the option Allowed HTML tags <img><strong><p><br/><br>, or leave it at its default?

What I actually want is to grab not the whole news item but only part of it (the normal feed is enough for me, but in that case the picture is small). That's why I want to grab the news item with the original image, but with less text.

Thanks in advance
