Downloading Text and Binary Objects with cURL

Many orchestration and automation processes need to download content from external or internal sources over protocols like HTTP and FTP. The simplest way to do this is with lightweight, widely supported tools. The tool I’ve found to be the most common and popular for scripting downloads inside configuration management systems and other script management tools is cURL.

What is cURL?

cURL is short for “Client URL” (the project notes that the name can also be read as “see URL”). It is a simple, yet powerful, command line utility for transferring content, delivered as a lightweight executable with cross-platform support. cURL is community supported and is often already packaged with *nix systems.

You can download builds of cURL for a wide range of platforms from https://curl.haxx.se/download.html, even including AmigaOS if you so desire 🙂

Why use cURL?

The tool most often compared to cURL is wget. Full feature matrices comparing the many available download tools exist, but for simplicity, cURL and wget tend to be the go-to standards on *nix and Windows systems because of their small footprint and flexibility.

cURL and wget have many similarities including:

  • download via HTTP, HTTPS, and FTP
  • both command line tools with multiple platforms supported
  • support for HTTP POST requests

cURL also covers more protocols and features than wget, and it offers a library API. Its feature set includes:

  • many protocols, including DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMTP, SMTPS, Telnet and TFTP
  • SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, HTTP/2, cookies, user+password authentication (Basic, Plain, Digest, CRAM-MD5, NTLM, Negotiate and Kerberos), file transfer resume, proxy tunneling and more (source: curl.haxx.se)
  • API support across platforms through libcurl

Let’s take a look at our example code to see how to make use of cURL.

Downloading HTML or Text with cURL

It’s frighteningly simple to download text and non-binary objects with cURL. You simply use this format:

curl <source URL>

This downloads the target URL and writes the content to STDOUT, which in most cases is the console.

[Screenshot: curl printing a Google page to the console]
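
Because the content goes to STDOUT, it can also be piped into other commands instead of filling the console. Here is a minimal sketch, assuming a POSIX shell; fetch_head is a hypothetical helper name and the URL in the example comment is only a placeholder:

```shell
#!/bin/sh
# fetch_head is a hypothetical helper, not part of curl.
# It prints only the first N lines of a downloaded text resource.
fetch_head() {
    url=$1
    lines=${2:-5}   # default to 5 lines when no count is given
    # -s: silent mode, hides the progress meter and error messages
    curl -s "$url" | head -n "$lines"
}

# Example (placeholder URL):
# fetch_head "https://www.example.com/" 3
```

The helper only wraps the plain curl <source URL> form shown above; everything after the pipe is ordinary shell.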

If you wanted to output it to a file, you just add -o to the command line with a target file name (note: that is a lower case o):

[Screenshot: curl with -o writing the Google page to a file]
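
In a script it helps to know whether the file was really written. The sketch below is one possible wrapper, not the only way to do it; save_page is a hypothetical helper name, and the URL and file name in the example comment are placeholders:

```shell
#!/bin/sh
# save_page is a hypothetical helper, not part of curl.
save_page() {
    url=$1
    target=$2
    # -f: return an error on HTTP error responses instead of saving the error page
    # -s: hide the progress meter
    # -S: still show an error message when something fails
    # -o: write the content to the named file (lower case o)
    if curl -fsS -o "$target" "$url"; then
        echo "saved $url to $target"
    else
        echo "download failed: $url" >&2
        return 1
    fi
}

# Example (placeholder URL and file name):
# save_page "https://www.example.com/" example.html
```

The non-zero return value lets a configuration management tool or calling script stop early when a download fails.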

Downloading Binary content with cURL

If we had a binary file, we obviously can’t write it to STDOUT on the console, or else we would get garbage output. Let’s use this image as an example:

[Image: StillReal.jpg, the sample image used below]

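The raw URL for this file, which we will use in our examples, is https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg.

Using the same format as before, we would run curl https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg with no output option. That prints the raw bytes to the console and gives us this rather ugly result (recent curl versions detect binary output headed for a terminal and print a warning instead):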

[Screenshot: binary content shown as garbled characters on the console]
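
The garbage is a display problem rather than a download problem: the bytes on STDOUT are intact, the console just cannot render them. A minimal sketch, assuming a POSIX shell and a hypothetical helper name (byte_count), sends the download into another program instead of the console:

```shell
#!/bin/sh
# byte_count is a hypothetical helper, not a curl feature.
# It streams a download through STDOUT into wc instead of the console.
byte_count() {
    # -s: hide the progress meter that curl would otherwise print to STDERR
    # tr removes the padding some wc implementations add around the number
    curl -s "$1" | wc -c | tr -d ' '
}

# Example:
# byte_count "https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg"
```

The same pattern works for any command that reads binary data from a pipe, such as a checksum tool.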

To download a binary file, we can use the -O parameter (note: that is an upper case O), which saves the content unchanged to a local file named after the remote file:

curl -O https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg

[Screenshot: curl -O saving StillReal.jpg]
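
The file name that -O uses comes from the URL itself: the path is cut off and only the file part is kept. The sketch below approximates that rule with shell parameter expansion for a simple URL with no query string; remote_name is a hypothetical helper, not something curl provides:

```shell
#!/bin/sh
# remote_name is a hypothetical helper that approximates the file name
# curl -O would choose: everything after the last slash in the URL.
# Assumption: the URL has no query string or trailing slash.
remote_name() {
    printf '%s\n' "${1##*/}"
}

remote_name "https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg"
# prints: StillReal.jpg
```

This is handy in scripts that need to know the resulting file name up front, for example to check whether the file already exists before downloading it.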

The -O wasn’t strictly required for the download to succeed. curl transfers the same bytes either way; -O simply sends them to a file on the target filesystem with the same name as the source, instead of to STDOUT. We can achieve the same result with the -o parameter and an explicit target filename:

curl https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg -o StillReal.jpg

[Screenshot: curl -o saving the file as StillReal.jpg]

This is handy if you want to change the name of the file on the target filesystem. Let’s imagine that we want to download something and force a different name like anyimagename.jpg for example:

curl https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg -o anyimagename.jpg

[Screenshot: curl -o saving the file as anyimagename.jpg]

Because the output name is just a command line argument, you can build it from shell variables (a date, a hostname, a build number) when you want to name downloads programmatically, which is all sorts of awesome once you start using cURL for automated file management and other neat functions.
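
As one illustration of a programmatic name, the sketch below adds a date stamp to the downloaded file. The naming scheme and both helper names (stamped_name, download_stamped) are assumptions made for this example, and it expects the URL to end in a file name with an extension:

```shell
#!/bin/sh
# stamped_name and download_stamped are hypothetical helpers.
# stamped_name builds "<base>-<stamp>.<ext>" from its three arguments.
stamped_name() {
    printf '%s-%s.%s\n' "$1" "$2" "$3"
}

# download_stamped saves a URL under a date-stamped version of its own name
# and prints the name it used.
download_stamped() {
    url=$1
    file=${url##*/}                      # e.g. StillReal.jpg
    target=$(stamped_name "${file%.*}" "$(date +%Y%m%d)" "${file##*.}")
    # -f: error out on HTTP errors, -sS: quiet but still report failures
    curl -fsS -o "$target" "$url" && echo "$target"
}

# Example:
# download_stamped "https://raw.githubusercontent.com/discoposse/memes/master/StillReal.jpg"
```

Run daily from a scheduler, a helper like this keeps one copy per day instead of overwriting the previous download.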

We will tackle more cURL options again in the future, but hopefully this is a good start!

DiscoPosse
