Scrapes the titles (the text of the h1, h2 and h3 HTML tags) from a web page. Useful for collecting the headlines of online daily newspapers.

titles_scrap(link, contain = NULL, case_sensitive = FALSE, askRobot = FALSE)

Arguments

link

the URL of the web page to scrape

contain

a character string used to filter the titles. Defaults to NULL, in which case no filtering is applied.

case_sensitive

logical. Should the match against the contain argument be case sensitive? Defaults to FALSE.
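The interaction between contain and case_sensitive can be sketched in base R. This is only an illustration of the filtering idea, not the package's implementation: the helper name filter_titles is hypothetical, and it assumes contain is treated as a pattern matched anywhere in a title.

```r
# Hypothetical helper illustrating how `contain` and `case_sensitive`
# could interact; this is NOT the package's actual code.
filter_titles <- function(titles, contain = NULL, case_sensitive = FALSE) {
  # With no filter string, return every title unchanged
  if (is.null(contain)) {
    return(titles)
  }
  # Keep only the titles that match `contain`; case is ignored
  # unless case_sensitive = TRUE
  titles[grepl(contain, titles, ignore.case = !case_sensitive)]
}

titles <- c(
  "Trump Just Had to Light the Match",
  "How Trump Made the Fantasy Real",
  "What Would David Bowie Do?"
)

filter_titles(titles, contain = "trump")
filter_titles(titles, contain = "trump", case_sensitive = TRUE)
```

With the default case_sensitive = FALSE, the lowercase "trump" keeps the first two titles; with case_sensitive = TRUE it matches none of them.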

askRobot

logical. Should the function check the site's robots.txt file to see whether scraping the web page is allowed? Defaults to FALSE.
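To make the idea behind askRobot concrete, here is a deliberately simplified sketch of a robots.txt check in base R. It is an assumption-laden illustration, not what the function does internally: the helper is_path_allowed is hypothetical, it only reads Disallow rules in the "User-agent: *" group, and it uses plain prefix matching (no Allow rules, wildcards, or grouped user agents).

```r
# Hypothetical, simplified robots.txt check; NOT the package's actual code.
# `robots_lines` is the content of a robots.txt file, one line per element.
is_path_allowed <- function(robots_lines, path) {
  in_wildcard_group <- FALSE
  disallowed <- character(0)

  for (line in trimws(robots_lines)) {
    if (grepl("^user-agent:", line, ignore.case = TRUE)) {
      # A new user-agent line starts a new group; we only follow "*"
      agent <- trimws(sub("^user-agent:", "", line, ignore.case = TRUE))
      in_wildcard_group <- identical(agent, "*")
    } else if (in_wildcard_group &&
               grepl("^disallow:", line, ignore.case = TRUE)) {
      rule <- trimws(sub("^disallow:", "", line, ignore.case = TRUE))
      # An empty Disallow rule means "nothing is disallowed"
      if (nzchar(rule)) {
        disallowed <- c(disallowed, rule)
      }
    }
  }

  # The path is allowed if it starts with none of the disallowed prefixes
  !any(startsWith(path, disallowed))
}

robots <- c(
  "User-agent: *",
  "Disallow: /private/",
  "",
  "User-agent: otherbot",
  "Disallow: /"
)

is_path_allowed(robots, "/")
is_path_allowed(robots, "/private/page.html")
```

Real robots.txt files have more rules than this sketch handles, so a dedicated parser such as the robotstxt package is a safer choice in practice.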

Value

a character vector containing the scraped titles
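As a rough picture of how a character vector of titles arises from h1, h2 and h3 tags, the sketch below pulls their text out of an inline HTML string with base R regular expressions. The helper extract_titles and the sample HTML are hypothetical, and regex matching is only a stand-in for a real HTML parser; the function itself may work quite differently.

```r
# Hypothetical helper: extracts the text of h1, h2 and h3 tags from an
# HTML string using regular expressions. A simplification for illustration;
# NOT the package's actual code.
extract_titles <- function(html) {
  pattern <- "<h[1-3][^>]*>(.*?)</h[1-3]>"
  # Find every h1/h2/h3 element (opening tag, content, closing tag)
  matches <- regmatches(
    html,
    gregexpr(pattern, html, perl = TRUE, ignore.case = TRUE)
  )[[1]]
  # Drop all tags, including nested ones such as <em>, and trim whitespace
  trimws(gsub("<[^>]+>", "", matches))
}

html <- paste0(
  "<html><body>",
  "<h1>Main headline</h1>",
  "<p>Some text</p>",
  "<h2 class=\"sub\">Second <em>story</em></h2>",
  "<h4>Ignored</h4>",
  "</body></html>"
)

extract_titles(html)
```

Only the h1 and h2 elements contribute to the result here; the paragraph and the h4 heading are skipped.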

Examples

# \donttest{
# Extracting the current titles of the New York Times
link <- "https://www.nytimes.com/"
titles_scrap(link)
# }
#>  [1] "Your Monday Briefing"
#>  [2] "Listen to ‘The Sunday Read’"
#>  [3] "In the ‘At Home’ Newsletter"
#>  [4] "The Art of the Lie? The Bigger the Better"
#>  [5] "In a viral video, Arnold Schwarzenegger linked the Capitol riot to a rampage that was a prelude to the Holocaust."
#>  [6] "‘It Became Sort of Lawless’: Florida Vaccine Rollout Turns Into a Free-for-All"
#>  [7] "After unused coronavirus vaccine doses were discarded, New York State loosened rules on who can get the shot."
#>  [8] "A Year After Wuhan, China Tells a Tale of Triumph (and No Mistakes)"
#>  [9] "Chicago Is Reopening Schools Against Fierce Resistance From Teachers"
#> [10] "Indonesia Crash Thwarts Push to Rehabilitate Country’s Airlines"
#> [11] "Here’s what we know about the Boeing plane in the Indonesia crash."
#> [12] "Ved Mehta, Celebrated Writer for The New Yorker, Dies at 86"
#> [13] "Trump’s Lackeys Must Also Be Punished"
#> [14] "Impeach and Convict Trump. Congress Must Defend Itself."
#> [15] "The Narcissist in Chief Brings It All Crashing Down"
#> [16] "Are We Stuck With Trump in the White House?"
#> [17] "It Took a Genocide for Me to Remember My Uighur Roots"
#> [18] "Trump Just Had to Light the Match"
#> [19] "How Trump Made the Fantasy Real"
#> [20] "Trump’s Capitol Offense"
#> [21] "Were These the Fingerprints of a Terrorist?"
#> [22] "I Desegregated the University of Georgia. History Is Still in the Making."
#> [23] "What Would David Bowie Do?"
#> [24] "52 Places to Love in 2021"
#> [25] "With ‘I Hate Men,’ a French Feminist Touches a Nerve"
#> [26] "Book Review: How Comey’s View of Justice Differs From Trump’s"
#> [27] "Site Index"
#> [28] "Site Information Navigation"
#> [29] "How a String of Failures Led to a Deadly Siege at the Capitol"
#> [30] "Arrests Across Nation as D.C. Mayor Warns of Further Violence"
#> [31] "House Moves to Force Trump Out, Vowing Impeachment if Pence Won’t Act"
#> [32] "The Times analyzed the speech President Trump gave before his supporters rushed the Capitol."
#> [33] "Parler, a Chosen App of Trump Fans, Has Become a Test of Free Speech"
#> [34] "We Worked Together on the Internet. Last Week, He Stormed the Capitol."
#> [35] "Stripped of Twitter, Trump Faces a New Challenge: How to Get Attention"
#> [36] "Dozens have been charged after the riot. Here are some of the notable arrests."
#> [37] "Opinion"
#> [38] "Editors’ Picks"
#> [39] "Advertisement"