Scrapes one element from a web page.

scrap(link, node, clean = FALSE, askRobot = FALSE)

Arguments

link

the link of the web page to scrape

node

the HTML or CSS element to consider. The SelectorGadget tool is highly recommended for identifying it.

clean

logical. Should the function clean the extracted vector? Default is FALSE.

askRobot

logical. Should the function check the site's robots.txt to see whether scraping the web page is allowed? Default is FALSE.

Value

a character vector

Examples

# \donttest{
# Extracting IMDb top 250 movie titles
link <- "https://www.imdb.com/chart/top/"
node <- ".titleColumn a"
scrap(link, node)
# }
#> [1] "The Shawshank Redemption"
#> [2] "The Godfather"
#> [3] "The Godfather: Part II"
#> [4] "The Dark Knight"
#> [5] "12 Angry Men"
#> [6] "Schindler's List"
#> [7] "The Lord of the Rings: The Return of the King"
#> [8] "Pulp Fiction"
#> [9] "Il buono, il brutto, il cattivo"
#> [10] "The Lord of the Rings: The Fellowship of the Ring"
#> [11] "Fight Club"
#> [12] "Forrest Gump"
#> [13] "Inception"
#> [14] "The Lord of the Rings: The Two Towers"
#> [15] "Star Wars: Episode V - The Empire Strikes Back"
#> [16] "The Matrix"
#> [17] "Goodfellas"
#> [18] "One Flew Over the Cuckoo's Nest"
#> [19] "Shichinin no samurai"
#> [20] "Se7en"
#> [21] "La vita è bella"
#> [22] "Cidade de Deus"
#> [23] "The Silence of the Lambs"
#> [24] "It's a Wonderful Life"
#> [25] "Star Wars"
#> [26] "Saving Private Ryan"
#> [27] "Sen to Chihiro no kamikakushi"
#> [28] "The Green Mile"
#> [29] "Interstellar"
#> [30] "Gisaengchung"
#> [31] "Léon"
#> [32] "The Usual Suspects"
#> [33] "Seppuku"
#> [34] "The Lion King"
#> [35] "The Pianist"
#> [36] "Back to the Future"
#> [37] "Terminator 2: Judgment Day"
#> [38] "American History X"
#> [39] "Modern Times"
#> [40] "Psycho"
#> [41] "Gladiator"
#> [42] "City Lights"
#> [43] "The Departed"
#> [44] "The Intouchables"
#> [45] "Whiplash"
#> [46] "Hotaru no haka"
#> [47] "The Prestige"
#> [48] "Once Upon a Time in the West"
#> [49] "Casablanca"
#> [50] "Nuovo Cinema Paradiso"
#> [51] "Rear Window"
#> [52] "Alien"
#> [53] "Apocalypse Now"
#> [54] "Hamilton"
#> [55] "Memento"
#> [56] "The Great Dictator"
#> [57] "Raiders of the Lost Ark"
#> [58] "Django Unchained"
#> [59] "The Lives of Others"
#> [60] "Joker"
#> [61] "Paths of Glory"
#> [62] "WALL·E"
#> [63] "The Shining"
#> [64] "Avengers: Infinity War"
#> [65] "Sunset Blvd."
#> [66] "Witness for the Prosecution"
#> [67] "Oldeuboi"
#> [68] "Mononoke-hime"
#> [69] "Spider-Man: Into the Spider-Verse"
#> [70] "Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb"
#> [71] "The Dark Knight Rises"
#> [72] "Once Upon a Time in America"
#> [73] "Aliens"
#> [74] "Kimi no na wa."
#> [75] "Coco"
#> [76] "Avengers: Endgame"
#> [77] "American Beauty"
#> [78] "Braveheart"
#> [79] "Das Boot"
#> [80] "3 Idiots"
#> [81] "Toy Story"
#> [82] "Capharnaüm"
#> [83] "Tengoku to jigoku"
#> [84] "Amadeus"
#> [85] "Inglourious Basterds"
#> [86] "Star Wars: Episode VI - Return of the Jedi"
#> [87] "Good Will Hunting"
#> [88] "Taare Zameen Par"
#> [89] "Reservoir Dogs"
#> [90] "2001: A Space Odyssey"
#> [91] "Requiem for a Dream"
#> [92] "Vertigo"
#> [93] "M - Eine Stadt sucht einen Mörder"
#> [94] "Jagten"
#> [95] "Eternal Sunshine of the Spotless Mind"
#> [96] "Citizen Kane"
#> [97] "Dangal"
#> [98] "Full Metal Jacket"
#> [99] "Ladri di biciclette"
#> [100] "Singin' in the Rain"
#> [101] "The Kid"
#> [102] "North by Northwest"
#> [103] "Snatch"
#> [104] "1917"
#> [105] "A Clockwork Orange"
#> [106] "Scarface"
#> [107] "Ikiru"
#> [108] "Taxi Driver"
#> [109] "Idi i smotri"
#> [110] "Lawrence of Arabia"
#> [111] "Toy Story 3"
#> [112] "Amélie"
#> [113] "Jodaeiye Nader az Simin"
#> [114] "The Sting"
#> [115] "Incendies"
#> [116] "Metropolis"
#> [117] "Per qualche dollaro in più"
#> [118] "The Apartment"
#> [119] "Double Indemnity"
#> [120] "To Kill a Mockingbird"
#> [121] "Up"
#> [122] "Indiana Jones and the Last Crusade"
#> [123] "Heat"
#> [124] "L.A. Confidential"
#> [125] "Die Hard"
#> [126] "Green Book"
#> [127] "Monty Python and the Holy Grail"
#> [128] "Yôjinbô"
#> [129] "Batman Begins"
#> [130] "Rashômon"
#> [131] "Der Untergang"
#> [132] "Bacheha-Ye aseman"
#> [133] "Unforgiven"
#> [134] "Some Like It Hot"
#> [135] "Ran"
#> [136] "Hauru no ugoku shiro"
#> [137] "All About Eve"
#> [138] "Casino"
#> [139] "A Beautiful Mind"
#> [140] "The Great Escape"
#> [141] "The Wolf of Wall Street"
#> [142] "Pan's Labyrinth"
#> [143] "El secreto de sus ojos"
#> [144] "There Will Be Blood"
#> [145] "Lock, Stock and Two Smoking Barrels"
#> [146] "Tonari no Totoro"
#> [147] "Raging Bull"
#> [148] "Judgment at Nuremberg"
#> [149] "The Treasure of the Sierra Madre"
#> [150] "Dial M for Murder"
#> [151] "Three Billboards Outside Ebbing, Missouri"
#> [152] "Shutter Island"
#> [153] "The Gold Rush"
#> [154] "Chinatown"
#> [155] "Babam ve Oglum"
#> [156] "No Country for Old Men"
#> [157] "V for Vendetta"
#> [158] "Inside Out"
#> [159] "Det sjunde inseglet"
#> [160] "The Elephant Man"
#> [161] "The Thing"
#> [162] "Warrior"
#> [163] "The Sixth Sense"
#> [164] "Trainspotting"
#> [165] "Jurassic Park"
#> [166] "Klaus"
#> [167] "The Truman Show"
#> [168] "Gone with the Wind"
#> [169] "Finding Nemo"
#> [170] "Smultronstället"
#> [171] "Blade Runner"
#> [172] "Stalker"
#> [173] "Kill Bill: Vol. 1"
#> [174] "Anand"
#> [175] "Salinui chueok"
#> [176] "The Bridge on the River Kwai"
#> [177] "Fargo"
#> [178] "Room"
#> [179] "Gran Torino"
#> [180] "The Third Man"
#> [181] "Soul"
#> [182] "Relatos salvajes"
#> [183] "On the Waterfront"
#> [184] "Tôkyô monogatari"
#> [185] "The Deer Hunter"
#> [186] "In the Name of the Father"
#> [187] "Mary and Max"
#> [188] "Höstsonaten"
#> [189] "The Grand Budapest Hotel"
#> [190] "Gone Girl"
#> [191] "Before Sunrise"
#> [192] "Hacksaw Ridge"
#> [193] "Catch Me If You Can"
#> [194] "Persona"
#> [195] "Prisoners"
#> [196] "Andhadhun"
#> [197] "The Big Lebowski"
#> [198] "Sherlock Jr."
#> [199] "To Be or Not to Be"
#> [200] "The General"
#> [201] "How to Train Your Dragon"
#> [202] "Ford v Ferrari"
#> [203] "Eskiya"
#> [204] "Barry Lyndon"
#> [205] "12 Years a Slave"
#> [206] "Mr. Smith Goes to Washington"
#> [207] "Mad Max: Fury Road"
#> [208] "Million Dollar Baby"
#> [209] "Network"
#> [210] "Dead Poets Society"
#> [211] "Stand by Me"
#> [212] "Harry Potter and the Deathly Hallows: Part 2"
#> [213] "Ben-Hur"
#> [214] "Cool Hand Luke"
#> [215] "Hachi: A Dog's Tale"
#> [216] "Platoon"
#> [217] "Dom za vesanje"
#> [218] "Ah-ga-ssi"
#> [219] "Logan"
#> [220] "Into the Wild"
#> [221] "Rush"
#> [222] "Le salaire de la peur"
#> [223] "Life of Brian"
#> [224] "Les quatre cents coups"
#> [225] "Spotlight"
#> [226] "La haine"
#> [227] "La passion de Jeanne d'Arc"
#> [228] "Hotel Rwanda"
#> [229] "Amores perros"
#> [230] "Andrei Rublev"
#> [231] "Rocky"
#> [232] "Monsters, Inc."
#> [233] "Gangs of Wasseypur"
#> [234] "Kaze no tani no Naushika"
#> [235] "Rebecca"
#> [236] "Du rififi chez les hommes"
#> [237] "Rang De Basanti"
#> [238] "Before Sunset"
#> [239] "Fa yeung nin wah"
#> [240] "Paris, Texas"
#> [241] "Portrait de la jeune fille en feu"
#> [242] "It Happened One Night"
#> [243] "Vikram Vedha"
#> [244] "Contratiempo"
#> [245] "The Help"
#> [246] "The Princess Bride"
#> [247] "La battaglia di Algeri"
#> [248] "Mandariinid"
#> [249] "Drishyam"
#> [250] "Fanny och Alexander"
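
The example above leaves clean and askRobot at their defaults. The following is a minimal sketch of a call that sets both, using only the signature documented on this page. It assumes scrap() comes from the ralger package (this page does not name the package), and scrap_politely is a hypothetical helper name introduced here for illustration, not part of the documented API.

```r
# Hypothetical helper (not part of the package): calls scrap() with the
# robots.txt check always switched on, and cleaning on by default.
# Assumption: scrap() is exported by the ralger package.
scrap_politely <- function(link, node, clean = TRUE) {
  ralger::scrap(link, node, clean = clean, askRobot = TRUE)
}

# Usage (needs the package installed and network access, so it is left
# commented out here):
# titles <- scrap_politely("https://www.imdb.com/chart/top/", ".titleColumn a")
# is.character(titles)  # the Value section documents a character vector
# head(titles, 3)
```

Wrapping the call keeps the robots.txt check from being forgotten when the same page is scraped in several places. What clean changes in the returned vector is not specified on this page, so compare the results with clean = FALSE and clean = TRUE before relying on it.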