JStudio SiteWalker Service Board
Find out and share information about our product, your experience, and your requirements. To serve all users, all postings must be written in English. We are looking for constructive opinions and experience. Disrespectful postings will be deleted immediately.
 

Saving complete HTML documents

 
Peter Müller
Guest
Posted: Thu Mar 09, 2006 8:43 pm    Post subject: Saving complete HTML documents

Hello,

is it possible to extract and save complete web documents, instead of extracting and saving single values or HTML elements (which works fine)?

Thank you for your help.

best regards,

Peter Müller
contentsaver
Guest
Posted: Thu Mar 16, 2006 10:59 am    Post subject: extract page element

Hi,

Saving as a complete web document is not supported. However, you can select the HTML tag itself as the step to extract. The whole content is then saved to Excel or to a file.
Note that this is only the source HTML of the current web document, without any included resources (images etc.).

Try it.

Bye,
Rolf
Guest
Posted: Fri Mar 17, 2006 9:40 am    Post subject: Save content with JavaScript

Hi,

we use SiteWalker only to automate navigation. For each document we want to collect and save locally, we specified a test that navigates to it.
Then we added a "save" step. We also wrote a JavaScript file that saves the document's content (the body tag) to a local file, so the save step simply executes that script.
Note: for this to work, Internet Explorer security must be set to the low level. This process, too, saves only text, without any images.

Here is the script that extracts the displayed text:

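//###############
var fso, textFile;
// Mode 8 appends, so each saved page is added to the end of the file.
var ForAppending = 8;

//ForReading 1 Open a file for reading only. You can't write to this file.
//ForWriting 2 Open a file for writing only. You can't read from this file.
//ForAppending 8 Open a file and write to the end of the file.

// Requires Internet Explorer with ActiveX scripting allowed (low security level).
fso = new ActiveXObject("Scripting.FileSystemObject");
// Third argument true: create the file if it does not exist yet.
textFile = fso.OpenTextFile("c:\\test.txt", ForAppending, true);

// innerText is the displayed text of the body, without any markup.
textFile.Write(document.getElementsByTagName("BODY")[0].innerText);
textFile.Close();
//#################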

We think this is the easiest way to save complete HTML content, and it gives us full control over all saving activities and the output format.
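If the full markup is wanted rather than the displayed text, a variation along the following lines might work. It is only a minimal sketch and has not been tried with SiteWalker: the helper names saveDocumentHtml and makeFileName and the folder c:\pages are made up for illustration, the target folder is assumed to exist already, and the script still needs the low Internet Explorer security setting mentioned above. It writes documentElement.outerHTML instead of innerText, opens the file with ForWriting (2) so an existing file is overwritten, and derives one file name per page from the page title.

```javascript
// Sketch only: saveDocumentHtml and makeFileName are hypothetical helper names.
// ForWriting (2) replaces the file content, unlike ForAppending (8).
var ForWriting = 2;

// Build a file name from the page title. Every run of characters other than
// letters, digits, '-' and '_' becomes a single '_' so the name is valid on Windows.
function makeFileName(folder, title) {
    var safe = String(title || "untitled").replace(/[^A-Za-z0-9_-]+/g, "_");
    return folder + "\\" + safe + ".html";
}

// Write the markup of doc (not just its visible text) to a file in folder.
// fso is a Scripting.FileSystemObject, doc is the page's document object.
function saveDocumentHtml(fso, doc, folder) {
    var path = makeFileName(folder, doc.title);
    // Fourth argument -1 (TristateTrue) opens the file as Unicode,
    // so non-ASCII characters in the page are not lost.
    var file = fso.OpenTextFile(path, ForWriting, true, -1);
    try {
        file.Write(doc.documentElement.outerHTML);
    } finally {
        file.Close();
    }
    return path;
}

// Inside Internet Explorer the call would look like this:
// saveDocumentHtml(new ActiveXObject("Scripting.FileSystemObject"), document, "c:\\pages");
```

As with the script above, images and other included resources are not saved; only the markup of the current document ends up in the file.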

regards,
All times are GMT