Objective: Web scraping with Python
Due: December 7 (11:59pm). 10 pts will be deducted for each late day (24 hr). No submissions will be accepted after Dec 10 (11:59pm). The total grade for the project is 100.
Grading Procedures: All submissions will be checked with plagiarism software. Submissions having more than 70% similarity to any other student submission and/or internet resources will share the total points of the assignment. For example, 4 submissions having more than 70% similarity will each be graded as 100/4 = 25 pts, assuming that the program is worth 100 pts.
Description: The university maintains course schedules for different semesters (spring, fall, winter, etc.) at http://appsprod.tamuc.edu/Schedule/Schedule.aspx. You will develop a Python program that dynamically performs tasks such as list, find, sort, and save on the course listings from the schedule portal. You will mainly use the “requests” and “BeautifulSoup” libraries (or similar, see exercise 12.1). The program will operate at two levels: Semester and Department. Your program will be a menu-based application. Assume that your project file is myproject.py. When run, it will show the last 5 semesters (fall, spring, and summer only; not winter or May mini):
> python myproject.py
Choose a semester: 1) Spring 2021 2) Fall 2020 3) Summer II 4) Summer I 5) Spring 2020
Here, your program will parse the data from the website and show only the 5 most recent semesters. The user will make a selection (Fall 2020 in this example), and you will then show the departments for the selected semester. Note that the selected semester appears before a “>” sign.
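The semester filtering described here can be sketched as follows. This is only a minimal sketch under stated assumptions, not the required solution: it assumes the semesters appear on the page as <option> elements listed most recent first, and SAMPLE_HTML, OptionCollector, and recent_regular_terms are hypothetical names invented for illustration. It uses the standard-library html.parser as a stand-in for BeautifulSoup so that it runs without extra installs; inspect the real page's markup before relying on any of it.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real schedule page may be structured differently.
SAMPLE_HTML = """
<select id="term">
  <option>Spring 2021</option>
  <option>Winter Mini 2020</option>
  <option>Fall 2020</option>
  <option>Summer II 2020</option>
  <option>Summer I 2020</option>
  <option>May Mini 2020</option>
  <option>Spring 2020</option>
  <option>Fall 2019</option>
</select>
"""


class OptionCollector(HTMLParser):
    """Collect the visible text of every <option> element."""

    def __init__(self):
        super().__init__()
        self.labels = []
        self._in_option = False

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            self._in_option = True
            self.labels.append("")

    def handle_endtag(self, tag):
        if tag == "option":
            self._in_option = False

    def handle_data(self, data):
        if self._in_option:
            self.labels[-1] += data


def recent_regular_terms(html, limit=5):
    """Return the first `limit` fall/spring/summer labels, in page order."""
    collector = OptionCollector()
    collector.feed(html)
    labels = [label.strip() for label in collector.labels]
    # Keep regular terms only; winter and May mini terms are skipped.
    regular = [
        label for label in labels
        if label.lower().startswith(("fall", "spring", "summer"))
    ]
    return regular[:limit]


for number, term in enumerate(recent_regular_terms(SAMPLE_HTML), start=1):
    print(f"{number}) {term}")
```

In the real program, the HTML string would come from the schedule portal rather than from a constant.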
Fall 2020> Select a department:
2) Accounting and Finance
4) Ag Science & Natural Resources
30) Social Work
Fall 2020> Art > Select an option:
1) List courses by instructor name
2) List courses by capacity
3) List courses by enrollment size
4) List courses by course prefix
5) Save courses in a csv file
6) Search courses by instructor name
7) Search courses by course prefix
Here, your program will parse the data from the website and show all available departments, then the list of tasks for the selected department. The Q (go back) option will take the user to the previous level.
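One way to handle the numbered menus and the Q (go back) option is a single reusable prompt function. The sketch below is an assumption about how this could be organized, not a prescribed design: choose and its parameters are hypothetical names, and the read/write parameters exist only so the function can be tried without a keyboard.

```python
def choose(levels, title, options, read=input, write=print):
    """Show a numbered menu and return the chosen option.

    `levels` holds the selections made so far (for example ["Fall 2020"]),
    which are printed before a ">" sign. Returns None when the user
    enters Q, so the caller can step back to the previous level.
    """
    prefix = "".join(f"{level}> " for level in levels)
    while True:
        write(f"{prefix}{title}:")
        for number, label in enumerate(options, start=1):
            write(f"{number}) {label}")
        write("Q) Go back")
        answer = read("Your choice: ").strip()
        if answer.lower() == "q":
            return None
        if answer.isdecimal() and 1 <= int(answer) <= len(options):
            return options[int(answer) - 1]
        write("Invalid choice, please try again.")


if __name__ == "__main__":
    # Simulated session: an invalid entry followed by a valid one.
    answers = iter(["9", "2"])
    picked = choose(
        ["Fall 2020"],
        "Select a department",
        ["Accounting and Finance", "Art"],
        read=lambda prompt: next(answers),
    )
    print("Selected:", picked)
```

A caller can loop over the semester, department, and task menus and move one level up whenever choose returns None.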
Course listing output should show the following fields. For instance, the course listing for “Fall 2020> Computer Science & Info Sys> List courses by course prefix” should show
Prefix ID    Sec   Name                      Instructor           Hours Seats Enroll.
COSC   1301  01W   Intro to Compu            Lee, Kwang           3     35    10
COSC   1436  01E   Intro to Comp Sci & Prog  Brown, Thomas        4     40    36
COSC   1436  01L   Intro to Comp Sci & Prog  Brown, Thomas              40    36
COSC   1436  01W   Intro to Comp Sci & Prog  Hu, Kaoning          4     45    43
COSC   1436  02E   Intro to Comp Sci & Prog  Hu, Kaoning          4     35    32
as the first 5 rows.
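The list and search options can be sketched once the courses are held as records. The following is a minimal illustration, assuming each course has already been scraped into a record; Course, SORT_KEYS, list_by, search_by_instructor, and search_by_prefix are hypothetical names, ascending order is an assumption (the assignment does not state a direction), and the three rows are values as read from the sample listing.

```python
from typing import NamedTuple


class Course(NamedTuple):
    prefix: str
    course_id: str
    section: str
    name: str
    instructor: str
    hours: str
    seats: int
    enrolled: int


# Three rows as read from the sample listing, deliberately out of order.
COURSES = [
    Course("COSC", "1436", "01W", "Intro to Comp Sci & Prog", "Hu, Kaoning", "4", 45, 43),
    Course("COSC", "1301", "01W", "Intro to Compu", "Lee, Kwang", "3", 35, 10),
    Course("COSC", "1436", "01E", "Intro to Comp Sci & Prog", "Brown, Thomas", "4", 40, 36),
]

# One sort key per listing option (ascending order is an assumption).
SORT_KEYS = {
    "instructor": lambda c: c.instructor.lower(),
    "capacity": lambda c: c.seats,
    "enrollment": lambda c: c.enrolled,
    "prefix": lambda c: (c.prefix, c.course_id, c.section),
}


def list_by(courses, field):
    """Return a new list sorted by one of the SORT_KEYS fields."""
    return sorted(courses, key=SORT_KEYS[field])


def search_by_instructor(courses, text):
    """Case-insensitive substring match on the instructor name."""
    needle = text.strip().lower()
    return [c for c in courses if needle in c.instructor.lower()]


def search_by_prefix(courses, prefix):
    """Exact, case-insensitive match on the course prefix."""
    wanted = prefix.strip().upper()
    return [c for c in courses if c.prefix.upper() == wanted]


for course in list_by(COURSES, "instructor"):
    print(course.instructor, course.prefix, course.course_id, course.section)
```

Keeping seats and enrollment as integers matters here: sorting them as strings would place "100" before "35".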
You will follow the above headers and order (Prefix (col. width 6), ID (5), Sec (5), Name (25), Instructor (20), Hours (5), Seats (5), Enroll. (7)) for the other listing selections too. Data cells should be aligned with the column headers and left-justified. No word in a course name should be longer than 5 characters; for instance, “Algorithms” should be abbreviated as “Algor”. The length of a course name will not exceed 25 characters. In option 5, the above format should be used to save a listing to a .csv file. The user will be able to provide a filename for the csv file.
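The column widths and the abbreviation rule can be sketched with plain string methods. This is a minimal sketch under assumptions: WIDTHS, HEADERS, abbreviate, and format_row are hypothetical names, the single-space separator between columns is an assumption added for readability (pass sep="" for the bare widths), and “Introduction to Algorithms” is an invented course title.

```python
# Column widths in the order given in the assignment text.
WIDTHS = (6, 5, 5, 25, 20, 5, 5, 7)
HEADERS = ("Prefix", "ID", "Sec", "Name", "Instructor", "Hours", "Seats", "Enroll.")


def abbreviate(name, max_word=5, max_len=25):
    """Cut every word to max_word characters and the whole name to max_len."""
    words = [word[:max_word] for word in name.split()]
    return " ".join(words)[:max_len]


def format_row(cells, sep=" "):
    """Left-justify each cell in its column, truncating anything too long."""
    return sep.join(
        str(cell)[:width].ljust(width) for cell, width in zip(cells, WIDTHS)
    )


print(format_row(HEADERS))
print(format_row(
    ("COSC", "1301", "01W", abbreviate("Introduction to Algorithms"),
     "Lee, Kwang", 3, 35, 10)
))
```

Because the header and the data rows go through the same function, the cells stay aligned with their headers by construction.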
For this program, you need to develop at least one class (chapter 10) with (possibly) many methods.
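As one possible shape for such a class, the sketch below groups a listing with a method that saves it as csv. CourseListing and its method names are hypothetical, and writing plain comma-separated cells (rather than padded fixed-width cells) is an assumption about what the csv output should contain. The row is taken from the sample listing.

```python
import csv
import io


class CourseListing:
    """Courses of one department in one semester (illustrative sketch)."""

    HEADERS = ["Prefix", "ID", "Sec", "Name", "Instructor", "Hours", "Seats", "Enroll."]

    def __init__(self, semester, department, rows):
        self.semester = semester
        self.department = department
        self.rows = [list(row) for row in rows]

    def __len__(self):
        return len(self.rows)

    def write_csv(self, stream):
        """Write the header line and all rows to an open text stream."""
        writer = csv.writer(stream)
        writer.writerow(self.HEADERS)
        writer.writerows(self.rows)

    def save_csv(self, filename):
        """Save the listing under a user-supplied filename."""
        # newline="" lets the csv module control line endings itself.
        with open(filename, "w", newline="", encoding="utf-8") as handle:
            self.write_csv(handle)


listing = CourseListing(
    "Fall 2020",
    "Computer Science & Info Sys",
    [["COSC", "1436", "01E", "Intro to Comp Sci & Prog", "Brown, Thomas", "4", "40", "36"]],
)
buffer = io.StringIO()
listing.write_csv(buffer)
print(buffer.getvalue())
```

Note that the csv module quotes the instructor cell automatically because the name contains a comma, so the file still has eight columns per row.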