cURL Installation
- download cacert.pem from https://curl.haxx.se/docs/caextract.html
- download curl from https://curl.haxx.se/download.html (I use a Windows system, so I downloaded the 64-bit Windows binary)
- extract curl to c:\curl and put cacert.pem in the c:\curl\bin folder
- add an environment variable named curl that holds the path of curl.exe
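The installation steps above can be checked from Git Bash with a short sketch like the following. The /c/curl/bin path is the Git Bash form of the folder used above, and check_curl is a hypothetical helper name. curl documents CURL_CA_BUNDLE as the environment variable it reads as its --cacert value; whether a given Windows build actually uses the bundle depends on its TLS backend, so treat this as an assumption to verify on your own system.

```shell
# Assumed Git Bash path for the c:\curl\bin folder used in the steps above.
CURL_BIN=/c/curl/bin

# Point curl at the downloaded certificate bundle (same effect as --cacert).
export CURL_CA_BUNDLE="$CURL_BIN/cacert.pem"

# check_curl is a hypothetical helper: it prints the first line of
# "curl --version" if curl is on the PATH, and fails otherwise.
check_curl() {
  if command -v curl >/dev/null 2>&1; then
    curl --version | head -n 1
  else
    echo "curl not found on PATH"
    return 1
  fi
}
```

Running check_curl after sourcing these lines shows whether the shell can find curl at all.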
Curl is useful for downloading files from websites. The basic command is the following:
curl -O url
This command needs to be run from the c:\curl\bin folder. The -O option saves the file under its remote name; on my system the files are downloaded to C:\Users\username\
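As a rough illustration of what -O does with the file name, here is a minimal sketch. remote_name and fetch_into are hypothetical helper names, and the example.com URL is a placeholder; the sketch assumes simple URLs without query strings, like the ones used later in this post.

```shell
# remote_name prints the last path segment of a URL, which is the
# name curl -O saves under for simple URLs without a query string.
remote_name() {
  printf '%s\n' "${1##*/}"
}

# fetch_into is a hypothetical helper: it changes into a target folder
# inside a subshell and downloads one URL there under its remote name.
fetch_into() {
  dir=$1
  url=$2
  ( cd "$dir" && curl -O "$url" )
}

remote_name "https://example.com/docs/guide.pdf"   # prints guide.pdf
```

To choose the local name yourself, use lowercase -o followed by a file name instead of -O.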
Purpose: download the SAS 9.3 user guide PDF files from https://support.sas.com/documentation/onlinedoc/stat/930/
- In Windows cmd, under c:\curl\bin, run curl -o index https://support.sas.com/documentation/onlinedoc/stat/930/
- this generates the index file, which contains the HTML source code of the webpage
- open Git Bash and run the following:
  cd /c/curl/bin
  grep -i pdf index > list
- list contains the lines with href="*.pdf". Use Excel's Text to Columns to keep only the names of the PDF files.
- open list in Notepad++. The bottom of the window shows "Windows (CR LF)"; right-click it and select "Unix (LF)". This fixes the error "curl: (3) Illegal characters found in URL"
- start a new bash file in Notepad++ with the following code:
  echo "Start!"
  url=https://support.sas.com/documentation/onlinedoc/stat/930/
  while read query
  do
    curl -O "$url${query}"
    echo "$url${query}"
  done < list
- save it as echo
- in Git Bash, navigate to the folder that contains the echo file and run the following:
  bash echo
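The Excel and Notepad++ steps above could also be done inside Git Bash. The following is a minimal sketch, not the method used in this post: extract_pdf_names and download_list are hypothetical helper names, DRY_RUN is a made-up switch for previewing URLs, and the sketch assumes every href value is a file name relative to the base URL.

```shell
# extract_pdf_names reads HTML on stdin and prints one href value
# ending in .pdf per line, without quotes or carriage returns.
extract_pdf_names() {
  grep -io 'href="[^"]*\.pdf"' |          # keep only the href="...pdf" parts
    sed -e 's/^[^"]*"//' -e 's/"$//' |    # strip href=" and the closing quote
    tr -d '\r'                            # drop Windows carriage returns
}

# download_list reads names from a list file and fetches base URL + name.
# With DRY_RUN=1 it only prints the URLs it would download.
download_list() {
  base=$1
  list=$2
  tr -d '\r' < "$list" | while IFS= read -r name || [ -n "$name" ]
  do
    [ -n "$name" ] || continue            # skip empty lines
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf '%s\n' "$base$name"
    else
      curl -O "$base$name"
    fi
  done
}
```

A possible use, after the index file has been downloaded: run extract_pdf_names < index > list, then set DRY_RUN=1 and call download_list with the base URL and list to preview the URLs before downloading for real. Removing the carriage returns with tr has the same purpose as switching the file to "Unix (LF)" in Notepad++.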