Adding new functionalities:

  • pdf_scrap
  • xls_scrap
  • xlsx_scrap
  • csv_scrap
  • comments_scrap
  • Fixing bugs in tests.
  • Adding the images_scrap() function, which allows the user to download images from a website
  • Adding the images_preview() function, which allows the user to list the image URLs from a specific web page
  • Wrapping paragraphs_scrap() within a tryCatch() function.
  • Adding stringi::stri_remove_empty() to the scrap() function in order to remove empty elements
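A minimal sketch of the two new image functions. The argument names (link, extn) are assumptions based on the descriptions above, and the URL is hypothetical:

```r
library(ralger)

link <- "https://example.com/gallery"  # hypothetical page

# List the URLs of the images found on the page
images_preview(link = link)

# Download the page's .jpg images
# (extn is assumed to filter images by file extension)
images_scrap(link = link, extn = "jpg")
```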

I have wrapped the ralger functions inside a tryCatch() call. Now ralger catches the following errors:

  • No internet connection: in this case, ralger displays a message and returns NA
  • Invalid link: the package’s functions display an informative message and also return NA
  • Code cleaning
  • Removing some dependencies
  • Modifying the message displayed by the robots.txt check
  • titles_scrap() now scrapes h1, h2 and h3 headings (previously only h1 and h2)
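The error handling described above can be illustrated as follows; the message-plus-NA behaviour is taken from the notes in this entry, and the link is deliberately invalid:

```r
library(ralger)

# With an invalid link, scrap() no longer throws an error: it prints
# an informative message and returns NA
result <- scrap(link = "https://not-a-real-site.invalid", node = "p")
is.na(result)
```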

1- Thanks to Ezekiel (ctb), I’ve added the fill argument to the table_scrap() function (the argument comes from the rvest package). The user now has the ability to set fill = TRUE when dealing with tables that have an inconsistent number of columns/rows.
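A short sketch of the fill argument; the URL is hypothetical:

```r
library(ralger)

# fill = TRUE is passed through to rvest so that rows with a
# different number of cells are padded instead of causing an error
messy <- table_scrap(link = "https://example.com/messy-table", fill = TRUE)
```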

1- I’ve added the choose argument to the table_scrap() function, which allows the user to choose which table to extract.
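For example, assuming a hypothetical page that contains several HTML tables:

```r
library(ralger)

# choose = 2 extracts the second table on the page
second <- table_scrap(link = "https://example.com/stats", choose = 2)
```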

2- I’ve added two functions:

  • titles_scrap(): scrapes titles from a website.
  • paragraphs_scrap(): scrapes paragraphs from a website.
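A minimal usage sketch of the two functions, on a hypothetical page:

```r
library(ralger)

link <- "https://example.com/blog"  # hypothetical page

titles_scrap(link = link)      # character vector of the page's titles
paragraphs_scrap(link = link)  # character vector of the page's paragraphs
```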

Two new functions:

  • table_scrap(): allows the user to extract an HTML table from a web page.
  • weblink_scrap(): allows the user to extract all web links within a web page.
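A sketch of the two functions on a hypothetical page:

```r
library(ralger)

link <- "https://example.com"  # hypothetical page

table_scrap(link = link)    # the page's HTML table as a data frame
weblink_scrap(link = link)  # character vector of the page's web links
```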

Also, introducing a new argument within each function: askRobot.
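As I understand the change, askRobot makes the function consult the site’s robots.txt before scraping; a hedged sketch with a hypothetical URL:

```r
library(ralger)

# askRobot = TRUE is intended to ask the website's robots.txt
# for permission before scraping
titles_scrap(link = "https://example.com", askRobot = TRUE)
```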

Initial release

  • Added a NEWS.md file to track changes to the package.