Jun 29 2011, 08:22 AM
Post #1
Web Design Seo | Group: Root Admin | Posts: 4,332 | Joined: 29-April 09 | From: Sofia | Member No.: 1
How to configure the scraper?

1. Configure the RSS feed.

2. Open the Scraper section and switch on the scraper part of the aggregator: Enable - YES.
[screenshot]

3. Open one content item on the website that the links in the RSS feed point to.
For example, RSS feed:
Code: http://feeds.nytimes.com/nyt/rss/HomePage
Example content item from this RSS feed:
Code: http://www.nytimes.com/2011/06/30/world/asia/30afghanistan.html

4. Right-click on the page and click View Source to see the HTML code of this page.

5. Find a string in the HTML code to use as the Starting string. For this page it is:
Code: <nyt_correction_top>
The start and end strings must be unique: each must appear in the HTML code only once! Otherwise the scraper will grab content from the first occurrence where the strings are found. The string does not have to be an HTML tag; you only need to configure some exact string from the page. If you don't know HTML, don't worry! It just has to be a unique string, so the extension will also work with:
Code: <nyt_correction_top
or with any other unclosed HTML tag.

6. Find a string in the HTML code to use as the End string. For this page it is:
Code: <nyt_author_id>
or
Code: <div id="pageLinks">
or
Code: <div id="pageLinks
It does not have to be a valid HTML tag; you only need to configure some exact string. A unique string is recommended!

Ready. Test the scraper and grab some content. If you have configured the right strings, the scraper will work. All content between the start string and the end string will be imported.

P.S. If you don't want to strip HTML, the option "Strip html Tags" must be set to: No.
"Image Relative URL prefix" is only for websites that use relative paths to pictures in their content. If the website uses relative paths to pictures, just enter in this field the domain you grab content from:
Code: http://web-site.com/

7. "Image Relative URL prefix"
Use this option only if the path to images in the source code is not a full URL, for example:
Code: <img src="directory/picture.jpg">
In this case you can set "Image Relative URL prefix" to:
Code: http://www.bbc.co.uk/
and on your site the picture will be loaded from:
Code: http://www.bbc.co.uk/directory/picture.jpg

8. "Permalink - search for" and "Permalink - replace with"
Use these options only if the final URLs of the content articles differ from the URLs in the RSS feed.
Example: if the URL in the feed is:
Code: http://www.feeds.bbc.co.uk/news/uk-england-london-18781322
and the final URL is:
Code: http://www.bbc.co.uk/news/uk-england-london-18781322
then "Permalink - search for" must be:
Code: http://www.feeds.bbc.co.uk
and "Permalink - replace with" must be:
Code: http://www.bbc.co.uk

When will the scraper not work?
- If you have not configured the starting string and the end string.
- If the starting string or the end string is not unique. You must find a string in the page's HTML code that is unique. If the string is not unique, the scraper will grab content from the first place on the page where this string is present.
- If the link to the full content item in the RSS feed is not the same as the real content item URL. The "Permalink - search for" and "Permalink - replace with" fields are the way to solve this, but they work only in some cases.

Quote: For some RSS feeds that use a system of redirects and special protection, it is not possible to grab the full content! There is no guarantee that the scraper will work with every feed!

--------------------
Forum Rules | How to receive support.
3D Web Design: Web design, SEO optimization, Web Site Extensions, Oscommerce Addons, Wordpress plugins and Joomla Extensions. Website development, search engine optimization for websites, and SEO services.
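To make the start/end string rule from the guide more concrete, here is a minimal Python sketch of the idea. It is not the extension's own code: the function names `extract_between` and `is_unique`, the sample page, and the choice to leave the marker strings themselves out of the result are all assumptions made for illustration.

```python
from typing import Optional


def extract_between(html: str, start: str, end: str) -> Optional[str]:
    """Return the text between the first `start` and the next `end`.

    Assumption: the marker strings themselves are not part of the result.
    Returns None when either marker is missing.
    """
    start_pos = html.find(start)
    if start_pos == -1:
        return None
    content_pos = start_pos + len(start)
    end_pos = html.find(end, content_pos)
    if end_pos == -1:
        return None
    return html[content_pos:end_pos]


def is_unique(html: str, marker: str) -> bool:
    """True when `marker` occurs exactly once in the page source."""
    return html.count(marker) == 1


# Hypothetical page source, shortened for the example.
page = (
    "<html><body><p>Menu</p>"
    "<nyt_correction_top><p>Story text.</p><nyt_author_id>"
    "<p>Footer</p></body></html>"
)

# Unique markers select the article body.
article = extract_between(page, "<nyt_correction_top>", "<nyt_author_id>")

# A marker that is not unique matches at its first occurrence instead.
wrong = extract_between(page, "<p>", "</p>")
```

In this sketch `article` holds the story paragraph, while `wrong` holds the menu text, because `<p>` appears several times and only the first occurrence is used. Checking a candidate string with `is_unique` before saving it in the Starting string or End string field would catch that kind of mistake.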
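Steps 7 and 8 of the first post (image prefix and permalink replacement) can be sketched in Python as well. This is only an illustration under assumptions: the helper names `add_image_prefix` and `rewrite_permalink` are hypothetical, the image handling covers only double-quoted `src` attributes, and the permalink rewrite is assumed to be a plain substring replacement.

```python
import re

# Matches a URL scheme such as "http:" or a protocol-relative "//" start.
ABSOLUTE_RE = re.compile(r"(?:[a-z][a-z0-9+.-]*:|//)", re.IGNORECASE)
# Captures: everything up to the opening quote, the src value, the closing quote.
IMG_SRC_RE = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]*)(")', re.IGNORECASE)


def add_image_prefix(html: str, prefix: str) -> str:
    """Prepend `prefix` to every relative <img src="..."> path."""
    base = prefix.rstrip("/") + "/"

    def fix(match):
        src = match.group(2)
        if ABSOLUTE_RE.match(src):
            return match.group(0)  # already a full URL, leave untouched
        return match.group(1) + base + src.lstrip("/") + match.group(3)

    return IMG_SRC_RE.sub(fix, html)


def rewrite_permalink(url: str, search_for: str, replace_with: str) -> str:
    """Replace `search_for` with `replace_with` in a feed URL."""
    if not search_for:
        return url  # nothing configured, keep the feed URL
    return url.replace(search_for, replace_with)


fixed_html = add_image_prefix(
    '<img src="directory/picture.jpg">', "http://www.bbc.co.uk/"
)
final_url = rewrite_permalink(
    "http://www.feeds.bbc.co.uk/news/uk-england-london-18781322",
    "http://www.feeds.bbc.co.uk",
    "http://www.bbc.co.uk",
)
```

With the values from the guide, `fixed_html` points the picture at `http://www.bbc.co.uk/directory/picture.jpg` and `final_url` becomes `http://www.bbc.co.uk/news/uk-england-london-18781322`. Images that already have a full URL are left unchanged in this sketch.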
Oct 3 2011, 02:27 PM
Post #2
Newbie | Group: Members | Posts: 32 | Joined: 26-November 10 | Member No.: 399
Hello,
I'm using the scraper version and I'd like to know: when I grab an RSS feed (with or without the Scraper function turned on), how can I remove some tags from the grabbed news? Something like the <img> tag, or other tags?
Thank you in advance.
Oct 3 2011, 02:42 PM
Post #3
Web Design Seo | Group: Root Admin | Posts: 4,332 | Joined: 29-April 09 | From: Sofia | Member No.: 1
Hello. You can do this with the fields "Strip html tags" and "Allowed html tags". See the screenshot in the first post.
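As a rough illustration of what stripping tags with an allowed list means, here is a small Python sketch. How the extension actually implements "Strip html tags" and "Allowed html tags" is not stated in this thread, so the function `strip_tags`, its `allowed` parameter, and the regex-based approach are assumptions. A regex is fine for a demonstration but is not a full HTML parser.

```python
import re

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>")


def strip_tags(html: str, allowed=()) -> str:
    """Remove every HTML tag except those whose name is in `allowed`.

    Only the tags are removed; the text between them is kept.
    """
    allowed_names = {name.lower() for name in allowed}

    def keep_or_drop(match):
        name = match.group(1).lower()
        return match.group(0) if name in allowed_names else ""

    return TAG_RE.sub(keep_or_drop, html)


# Hypothetical news fragment.
news = '<p>Hello <b>world</b><img src="a.jpg"></p>'

without_images = strip_tags(news, allowed=("p", "b"))
plain_text = strip_tags(news)
```

Here `without_images` keeps the paragraph and bold tags but drops the `<img>` tag, which is the case asked about above, and `plain_text` contains only the text.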
Dec 14 2011, 01:47 PM
Post #4
Newbie | Group: Members | Posts: 1 | Joined: 14-December 11 | Member No.: 999
Can you please give us a newer example? I am trying to follow this one, but it seems to be different, and as a result I can't make my scraper work properly.
Thanks in advance,
Jono
Dec 14 2011, 02:44 PM
Post #5
Web Design Seo | Group: Root Admin | Posts: 4,332 | Joined: 29-April 09 | From: Sofia | Member No.: 1
There is nothing new about these functions. What type of example do you need?
In the latest Scraper you can configure two or more start strings and end strings in this way:
Code: <div class="content">|<other_tag class="other_class">
In the "Starting string" field you can now enter several HTML tags; separate each one from the next with the | separator.
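One way to picture the | separator is the Python sketch below. The thread does not say how the extension chooses between the alternatives, so the rule used here (try each start string in order, then each end string in order, and return the first pair that matches) is an assumption, as are the function name `extract_with_alternatives` and the sample end strings.

```python
from typing import Optional


def extract_with_alternatives(
    html: str, start_field: str, end_field: str
) -> Optional[str]:
    """Extract content using '|'-separated start and end alternatives.

    Assumption: alternatives are tried left to right and the first
    start/end pair found in the page is used.
    """
    starts = [s for s in start_field.split("|") if s]
    ends = [e for e in end_field.split("|") if e]
    for start in starts:
        start_pos = html.find(start)
        if start_pos == -1:
            continue  # this start string is not on the page, try the next
        content_pos = start_pos + len(start)
        for end in ends:
            end_pos = html.find(end, content_pos)
            if end_pos != -1:
                return html[content_pos:end_pos]
    return None


start_field = '<div class="content">|<other_tag class="other_class">'
end_field = "</div>|</other_tag>"  # hypothetical end strings

# Two hypothetical pages from the same feed with different layouts.
page_a = '<div class="content">First layout</div><footer>'
page_b = '<other_tag class="other_class">Second layout</other_tag>'

result_a = extract_with_alternatives(page_a, start_field, end_field)
result_b = extract_with_alternatives(page_b, start_field, end_field)
```

In this sketch one configuration covers both page layouts, which is the likely reason for entering more than one start string: `result_a` comes from the first alternative and `result_b` from the second.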
May 17 2013, 07:45 AM
Post #6
Newbie | Group: Members | Posts: 5 | Joined: 12-May 13 | Member No.: 1,684
Quote (configuration guide from post #1): "You must configure the scraper if the scraper function is enabled. Switching it on is not enough. How to configure the scraper? ..."

Good morning everyone,
I am trying to import the news into the K2 component. In the preview I see the photos, the text, and the number of news items correctly, but in K2 they appear as shown in the attached picture. I don't understand where I'm going wrong. Can someone tell me how to fix the problem?