Info Finder Challenges

After completing the code-along, attempt the challenges below.

Optional Practice: Bug Fixing

Find and fix all of the bugs in the programs below. Note that the projects may have multiple bugs - fix all of them!

1. All Paragraphs

Get back into the Info Finder code. Update it so that instead of simply displaying the first paragraph, it displays all of them.

Updating the Function

Find the get_information function in the code
In the body of the function, find the for paragraph in paragraphs loop
Above the loop, create a new variable named para_texts
Set the variable equal to a new empty list: []
- This will hold the text from each paragraph
In the body of the loop, find the if clean_text: statement
In the body of the if, remove the return statement
In its place, add clean_text to the para_texts list
Outside of the for loop, remove the return "No Info..." statement
In its place, return the para_texts list

The program won't work just yet... it will be necessary to update the call to the get_information function.

Updating the Call

Find where the get_information function is called
Change the variable to be named info_paras
Create a new for loop to loop through each paragraph text
For each text, wrap it in a call to the fill function and print it
Above the loop, create an if statement
For the condition, check if len(info_paras) is 0
If it is, print a message saying "No Information Found"

Run the program and verify that all the paragraphs appear!

BONUS: Custom Number of Paragraphs

Instead of simply printing out all of the paragraphs, ask the user how many to print! In the for loop body, after printing each paragraph, ask the user if they would like to continue. If they choose not to continue, break out of the loop.

2. Printing a Table of Contents

In addition to finding the paragraphs of information on each page, find the table of contents information.

In the get_information function, find the html_document variable
Under that variable, create a variable named toc_search
Set the toc_search variable to be a new dictionary: {}
In the toc_search dictionary, add a key of "id" with a value of "toc"
Under that, create a variable named toc
Set the toc variable to equal a call to html_document.find
For the find call, pass in "div" for the element type as the first parameter
For the second parameter, pass in the toc_search variable
Under that, create a variable named toc_text
Set the toc_text variable to equal toc.get_text()
- This will return the text content for the table of contents
Finally, print out the toc_text variable

Run the code, enter a search term, and verify that the table of contents text appears!

3. Philosophy Game

This challenge will be quite challenging.

One interesting thing about Wikipedia articles is that if you click on the first non-parenthesized, non-italicized link on the page, and repeat the process, it will almost always lead to the Philosophy page. Check out this article for more information. There is also this video about the subject:

Recreate the philosophy game using a Python script! The program should work as follows:

Ask the user for a page to visit
Load the page using requests/BeautifulSoup
Find the first link on the page
- Avoid references to the same page, and citation links
- For an extra challenge, avoid parenthetical links as well
Load the page at the new URL
Repeat the process!

Note that "Philosophy" may be unreachable if parenthetical links are included. It will still be interesting to see where the journey goes though!

4. Scraping Another Webpage

This challenge could be challenging depending on the website.

Experiment with different websites! Try to extract data from an interesting place. Try to make it dynamic; create programs that ask the user what to do. Theoretically, any website could be scraped; it's all about looking at the HTML and figuring out how to grab the information.

Info Finder Challenges

Info Finder Challenges

Optional Practice: Bug Fixing

1. All Paragraphs

Updating the Function

Updating the Call

BONUS: Custom Number of Paragraphs

2. Printing a Table of Contents

3. Philosophy Game

4. Scraping Another Webpage

results matching ""

No results matching ""