Re: [Hampshire] extracting phrases from a file.

Author: James Courtier-Dutton
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] extracting phrases from a file.
Hi,

I forgot to mention: my starting document is not a valid HTML document,
so it probably will not load into a web browser.
Will what you have suggested still work?
I need this to run as a cron job, so using a web browser is
probably not the best solution.

On 12 September 2011 10:21, Benjie Gillam <benjie@???> wrote:
> Or, alternatively, open it into a decent web browser and type this into the JavaScript console:
>
> var as = document.getElementsByTagName('a'); var hrefs=[]; for (var i = 0, l = as.length; i<l; i++) {if (as[i].href) hrefs.push(as[i].href);} console.log(hrefs.join("\n"));
>
> Cheers,
>
> Benjie.
>
> On 12 Sep 2011, at 10:17, James Courtier-Dutton wrote:
>
>> Hi.
>>
>> I have a large file that contains snippets of HTML pages.
>> Each line is like this:
>> ....some junk.....<a href="some url"></a>
>>
>> I want to extract the "some url" bits, i.e. strip away the href and everything around it.
>> You can probably do this quite easily in perl.
>> Are there any nice short programs to do this?
>> Is it easier to do in some other language?
>>
>> Kind Regards
>>
>> James
>>
>> --
>> Please post to: Hampshire@???
>> Web Interface: https://mailman.lug.org.uk/mailman/listinfo/hampshire
>> LUG URL: http://www.hantslug.org.uk
>> --------------------------------------------------------------