Re: Getting HTML from web pages in unix scripts

From: Lucas Rockwell (lr@socrates.berkeley.edu)
Date: Tue Dec 10 2002 - 16:52:52 PST


Hi Susan,

To be more to the point, curl works great with https:

% curl -O https://calnet.berkeley.edu/index.html

will download the CalNet index.html file to my current directory. It works
great!
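
For use inside a shell script (which is what Susan asked about), something
along these lines might work. This is only a sketch: fetch_page is a made-up
helper name, not part of curl, and the only curl options it relies on are
-f (fail on HTTP errors), -s (silent), -S (still print error messages) and
-o (write to a named file instead of stdout).

```shell
#!/bin/sh
# fetch_page URL [OUTFILE]
# Hypothetical helper (not part of curl): prints the page to stdout,
# or saves it to OUTFILE when a second argument is given.
# Returns curl's exit status, so callers can test for failure.
fetch_page() {
    url=$1
    out=${2:-}
    if [ -n "$out" ]; then
        # -f: non-zero exit on HTTP errors, -sS: quiet but report errors
        curl -fsS -o "$out" "$url"
    else
        curl -fsS "$url"
    fi
}
```

A script could then capture the HTML with
html=$(fetch_page https://calnet.berkeley.edu/index.html) and check the exit
status before using it.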

-lucas

On Tue, 10 Dec 2002, Lucas Rockwell wrote:

> Hi Susan,
>
> If you are using Mac OS X, it comes with curl. curl -O <some url> is the
> basic way of doing it: the output file gets the same name as the page or
> file at the end of the URL (the rest of the URL is cut off). (That is an
> uppercase "oh", not a zero.)
>
> Here is the beginning of the man page:
>
> curl(1) Curl Manual curl(1)
>
> NAME
> curl - get a URL with FTP, TELNET, LDAP, GOPHER, DICT,
> FILE, HTTP or HTTPS syntax.
>
> SYNOPSIS
> curl [options] [URL...]
>
> DESCRIPTION
> curl is a client to get documents/files from or send docu-
> ments to a server, using any of the supported protocols
> (HTTP, HTTPS, FTP, GOPHER, DICT, TELNET, LDAP or FILE).
> The command is designed to work without user interaction
> or any kind of interactivity.
>
> curl offers a busload of useful tricks like proxy support,
> user authentication, ftp upload, HTTP post, SSL (https:)
> connections, cookies, file transfer resume and more.
>
> It looks like it will probably do what you want and then some.
>
> -lucas
>
> On Tue, 10 Dec 2002, Susan Mathews wrote:
>
> > It is often useful to be able to get the HTML generated by another web page
> > (in our environment we would mostly do this in perl or unix shell scripts).
> > Long ago we used lynx and some people used the related wget, I think; more
> > recently we have used webget
> > (http://asis.web.cern.ch/asis/products/PERL/jfriedl-tools.html). Does
> > anyone have advice on more modern tools, especially ones that can handle
> > https as well as http connections? I ran across cURL
> > (http://curl.haxx.se/), which seems to fit the bill. Does anyone know if it
> > works, or if there are major caveats for its use?
> > Susan
> >
> > -----------------------------------------------------------------------
> > The following was automatically added to this message by the list server:
> >
> > Webnet information is available at <URL:http://webnet.berkeley.edu/>.
> >
>


This archive was generated by hypermail 2.1.5 : Tue Dec 10 2002 - 16:57:19 PST