> Scrape Won't Work As Expected
mkrokos
post Sep 11 2013, 11:30 AM
Post #1


Newbie
*

Group: Members
Posts: 12
Joined: 9-September 13
Member No.: 1,821



I tried to enable Scraper to import the full article and the full-size image. Nevertheless, I am getting exactly the same results as if Scraper were not enabled: just the RSS feed content.

I think I have set up Scraper correctly according to the documentation. Here is an example:

The link of the remote article

The part of the source which is of interest is this:
CODE
<div class="postphoto">
                        <img class="enlargeImg" src="http://air.news.gr/cov/63/6352_MARIANTA_PIERIDI_29082013_b2.jpg" width="460" alt="Πέντε πράγματα που δεν ξέρετε για την Μαριάντα Πιερίδη" /><span class="enlarge close"></span>
                    </div>
                    
                    <div class="posttext">
                        <p>Η Μαριάντα Πιερίδη κατά καιρούς αποκαλύπτει διάφορα πράγματα για τον εαυτό της.</p><p>Ωστόσο, υπάρχουν και κάποια άλλα που δεν είναι και τόσο γνωστά στο ευρύ κοινό.</p><p>Η τραγουδίστρια εκνευρίζεται με την ατάκα «να 'σαι καλά» και όταν την αγγίζουν στην πλάτη.</p><p>Οι παιδικοί της ήρωες ήταν οι Thundercats και για φτιάξει την διάθεσή της περπατάει μέχρι τελικής πτώσης.</p><p>Τέλος, της αρέσει να συλλέγει φλιτζάνια από όλο τον κόσμο, ενώ ο τυχερός της αριθμός είναι το 13, όπως αναφέρει το περιοδικό ΕΓΩ.</p>
                        

                        
                    </div>
                    
<div class="likesponsor">


For the starting tag, I used <div class="postphoto">.
For the ending tag, I used <div class="likesponsor">. It is not part of the content and I don't want to import it, but it is unique, whereas I don't think a plain </div> would work since there are a hundred </div> tags later on.
I even added the <div> tag to the allowed HTML tags.

What am I doing wrong?

PS: I am using the K2 plugin.


This post has been edited by mkrokos: Sep 11 2013, 11:32 AM
Web Design Seo
post Sep 11 2013, 12:18 PM
Post #2


Web Design Seo
****

Group: Root Admin
Posts: 4,332
Joined: 29-April 09
From: Sofia
Member No.: 1



See the How To Configure Scraper guide.

Some important points from the guide:
1. The tag must be unique: it must appear in the HTML code only once!
2. It can be any string, such as a non-closed HTML tag, like this:
Code
<div id="pageLinks



If you want, you can configure Scraper to search for two or more strings. Use this format, for example:
Code
<div id="pageLinks|div id="other string|div id= string 3


This way, if Scraper does not find the first string, it will search for the second one, and so on.
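
The two rules above (a marker that occurs only once, plus pipe-separated fallbacks) can be illustrated with a minimal Python sketch. This is not Scraper's actual code: the function names `extract_between` and `is_unique` are hypothetical, and it is an assumption that the grabbed range starts at the start marker and stops just before the end marker.

```python
def is_unique(html, marker):
    """A marker is only a reliable anchor if it occurs exactly once."""
    return html.count(marker) == 1


def extract_between(html, start_spec, end_spec):
    """Return the text from the first start marker found up to (not
    including) the first end marker found after it, or None on failure.

    Both specs are pipe-separated lists of alternatives that are tried
    in order (assumed behaviour, modelled on the forum description).
    """
    for start in start_spec.split("|"):
        if not start:
            continue
        begin = html.find(start)
        if begin == -1:
            continue  # this alternative is absent, try the next one
        for end in end_spec.split("|"):
            if not end:
                continue
            stop = html.find(end, begin + len(start))
            if stop != -1:
                return html[begin:stop]
    return None
```

With a page shaped like the one in post #1, `is_unique(page, '</div>')` would be false, while a non-closed string such as `<div class="likesponsor` can still serve as a unique end marker.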


--------------------
Forum Rules | How to receive support. 3D Web Design: web design, SEO optimization, Web Site Extensions, Oscommerce Addons, Wordpress plugins and Joomla Extensions. Website development, search engine optimization, and SEO services.
mkrokos
post Sep 11 2013, 01:41 PM
Post #3





Quote (Web Design Seo @ Sep 11 2013, 03:18 PM)
See the How To Configure Scraper guide.

Some important points from the guide:
1. The tag must be unique: it must appear in the HTML code only once!
2. It can be any string, such as a non-closed HTML tag, like this:
Code
<div id="pageLinks



If you want, you can configure Scraper to search for two or more strings. Use this format, for example:
Code
<div id="pageLinks|div id="other string|div id= string 3


This way, if Scraper does not find the first string, it will search for the second one, and so on.


I changed the string by removing the trailing "> characters, but still no cigar. :(

I made sure that both of these tags are unique throughout the page source.

Can you give this a try and let me know if it works for you? The feed is this: http://www.news.gr/rss.ashx?catid=15
mkrokos
post Sep 11 2013, 02:55 PM
Post #4





OK, I changed the feed and now it works. I am still trying to figure out how to insert the correct article image in the K2 image field.

I am also trying to figure out how to resize images to fit properly in my website template.
mkrokos
post Sep 11 2013, 05:51 PM
Post #5





Two things I need your help with...
  1. Whenever I use just one starting tag and one ending tag, the content is properly inserted into the K2 item. But since certain things are kept separate in the source page (i.e., the article's default image is somewhere at the top, while the body with the rest of the images is further down), I tried to use two or more pairs of starting/ending tags, separated by a pipe symbol. In every case the scrape failed and I ended up with just the newsfeed content. I am sure I used unique tags. What I am not sure about is what gets inserted and what does not. For example, if I want to import the image from this line
    CODE
      <div class="article_sidebar"><a class="thickbox" title="Τέτοιο δικέφαλο δεν έχετε ξαναδεί" href="/files/temp/94F9DC413187BA1F7027AE28FECC37AC.jpg"><img rel="/files/temp/43DFBD153BA593850A32EAEA9A644934.jpg" title="Τέτοιο δικέφαλο δεν έχετε ξαναδεί" alt="Τέτοιο δικέφαλο δεν έχετε ξαναδεί" class="article_photo" src="/files/temp/43DFBD153BA593850A32EAEA9A644934.jpg"><span title="Μεγέθυνση φωτογραφίας" class="magnifier"> </span></a>

    (not the link, just the image inside the img tag), what do I use for the starting/ending tags? The '<div class="article_sidebar' string is unique, as is '<a class="thickbox'. Which one should be used as the starting tag? And would I use '<span title="Μεγέθυνση' as its corresponding ending tag?
  2. Also, I have a hard time configuring the introtext. I have tried all the options, but I either end up with no introtext, or I get the proper introtext in the category view but, when I click on "read more" to see the whole article, an hr readmore tag splits a paragraph abruptly.


Please help me out, since the documentation seems too limited to cover all possibilities.
Web Design Seo
post Sep 12 2013, 06:31 AM
Post #6





Quote
Please help me out, since the documentation seems too limited to cover all possibilities.


Yes, it is hard to document in detail more than 205 functions inside a single component. Joomla Scraper is a real Swiss Army knife :)

1. If the content and the picture are not together in one container, you must grab all the content, from start to end. (The start will be the picture or the text, whichever comes first in the HTML code from which you want to grab content.) If you need to strip something inside this content, you can configure that in the "strip content between tags" field.



2. About the introtext. The Scraper documentation says:

Quote
Attention: the "Split introtext: After N chars" function works well only with plain text!


This is because the "After N chars" function cannot tell whether the content sits inside some container (there can be many levels of nested HTML tags), nor which container it is, so it cannot close all the tags automatically. So it is better to use one of the other Split introtext options.

3. About pictures in K2: the image resize function is part of K2, not Joomla Scraper. You can configure the sizes in K2.
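
To make point 2 concrete, here is a small Python sketch of why cutting HTML after a fixed number of characters tends to leave tags open. It is only an illustration of the general problem, not Scraper's implementation; `split_after_n_chars` and `unclosed_tags` are hypothetical helper names, and the sample markup is made up.

```python
from html.parser import HTMLParser


class TagBalance(HTMLParser):
    """Tracks which tags are still open after parsing a fragment."""

    # Elements that never take a closing tag.
    VOID = {"br", "hr", "img", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()


def unclosed_tags(fragment):
    """Return the list of tags opened but not closed in the fragment."""
    parser = TagBalance()
    parser.feed(fragment)
    parser.close()
    return parser.open_tags


def split_after_n_chars(content, n):
    """Naive split that counts characters and ignores markup."""
    return content[:n], content[n:]


content = '<div class="posttext"><p>First paragraph.</p><p>Second paragraph.</p></div>'
intro, rest = split_after_n_chars(content, 30)
# The intro stops in the middle of the first paragraph, so both the
# <div> and the <p> are left open in the intro part.
print(unclosed_tags(intro))
```

With plain text there are no tags to leave open, which matches the note in the documentation that the option works well only on text without markup.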


mkrokos
post Sep 12 2013, 07:48 AM
Post #7





Yes, it's true that it has a sh*tload of functions that are not documented at all, for example Content SEO and more. It would be nice to have a full guide. :)

  1. I realised that I had to grab the whole content part and then strip things out through trial and error. It worked like a charm, except in cases where it would be nice to include a particular div in one place but not in general. Oh well... can't have it all, right?
  2. Still, I am not sure how the After "Before Item" and Before "After Item" options work. Could you please explain a bit? Ideally, I would like to have some introtext in the category view and the full text in the item view, without the readmore hr tag inserted into the text. How can I do that?
  3. Yes, you are right. I fixed the maximum_width thing in the K2 CSS and now it's OK!
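
The approach from point 1 (grab one wide range, then strip what is not wanted) can be sketched in Python as follows. This is a rough illustration under assumptions: `strip_between` is a hypothetical name, and it is assumed that a strip rule removes everything from a start string through the next end string. Real behaviour of the "strip content between tags" field may differ.

```python
def strip_between(content, start_marker, end_marker):
    """Remove every span that begins at start_marker and runs through
    the next end_marker. A start marker with no end marker after it is
    left untouched."""
    pieces = []
    pos = 0
    while True:
        begin = content.find(start_marker, pos)
        if begin == -1:
            break
        stop = content.find(end_marker, begin + len(start_marker))
        if stop == -1:
            break
        pieces.append(content[pos:begin])  # keep the text before the span
        pos = stop + len(end_marker)       # resume after the end marker
    pieces.append(content[pos:])
    return "".join(pieces)
```

Because the rule is a plain string match, it removes every matching span, which is consistent with the limitation noted above: a div cannot be kept in one place and stripped in another unless its opening string differs.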


Thanks!
ataman79
post Oct 28 2013, 10:22 AM
Post #8


Newbie
*

Group: Members
Posts: 32
Joined: 26-November 10
Member No.: 399



OK, I cannot understand this either. What does it mean?
About the intro text options After "Before Item" and Before "After Item": where do I get the intro text in each case?

Can someone explain?
Web Design Seo
post Oct 28 2013, 11:53 AM
Post #9





https://3dwebdesign.org/forum/joomla-scrape...for-joomla-t698



Split introtext option: this function inserts the "Read more" tag at different places in the content item:
- No intro text (the "Read more" tag is inserted at the start of the content item)
- After "before item" (the "Read more" tag is inserted after the content item and before the content that is inserted from the "before item" field of "Content seo")
- Before "After item" (after the content but before the "after item" content)
- Only introtext (all text inserted from the feed is inside fulltext)

"After item" and "before item" are fields in the Content Seo tab of the options. Whatever you put in these fields will appear before and after the original content.

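As a rough illustration of two of these options, the Python sketch below assembles an item from the "before item" text, the scraped content, and the "after item" text, then splits it at the read-more marker. It assumes the marker is Joomla's `<hr id="system-readmore" />` tag; the function names and the mode names `no_intro` and `before_after_item` are hypothetical, and the other two options are left out because their exact placement is not clear from the description.

```python
READMORE = '<hr id="system-readmore" />'  # assumed read-more marker


def assemble(before_item, content, after_item, mode):
    """Build the item body with the marker placed according to mode."""
    if mode == "no_intro":
        # Marker at the very start: nothing ends up in the introtext.
        return READMORE + before_item + content + after_item
    if mode == "before_after_item":
        # Marker after the scraped content, before the "after item" text.
        return before_item + content + READMORE + after_item
    raise ValueError("unsupported mode: " + mode)


def split_item(body):
    """Split a body into (introtext, fulltext) at the first marker."""
    intro, marker, full = body.partition(READMORE)
    if not marker:
        return body, ""  # no marker: everything is introtext
    return intro, full
```

In both modes the marker sits at a boundary between blocks, never inside the scraped content, so no paragraph is split abruptly.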

