CITS1401 Computational Thinking with Python Project 01 Sem 01 Brief | UWA
Project Description
You should construct a Python 3 program containing your solution to the following problem and submit your program electronically on Moodle. The name of the file containing your code should be your student ID, e.g., 12345678.py. No other method of submission is allowed. Please note that this is an individual project. If you click the “check” link, your program will be run automatically on Moodle against the sample test cases provided in the project sheet. However, this does not test all required criteria, and your submission will be manually tested thoroughly for grading purposes after the due date. Remember, you need to submit the program as a single file and copy-paste the same program into the provided text box. You have only one attempt to submit, so do not submit until you are satisfied with your attempt. All open submissions at the time of the deadline will be submitted automatically. Once your attempt is submitted, there is no way in the system to open, reverse, or modify it.
You are expected to have read and understood the University's guidelines on academic conduct. Under this policy, you may discuss with other students the general principles required to understand this project, but the work you submit must be the result of your own effort. Plagiarism detection and other systems for detecting potential malpractice will therefore be used. Besides, if what you submit is not your own work, then you will have learned little and will therefore likely fail the final exam.
You must submit your project before the deadline mentioned above. Following UWA policy, a late penalty of 5% will be deducted for each day, i.e., 24 hours after the deadline, that the assignment is submitted. No submissions will be allowed after 7 days following the deadline, except approved special consideration cases.
Project Overview
Background: The Australian Bureau of Statistics (ABS) has commissioned your services to develop a powerful data analysis tool that can provide insights into the population distribution across Australia's cities and regions. The government and urban planners require this tool to make informed decisions regarding infrastructure development, resource allocation, and future urban planning. Your task is to develop an analytical tool that processes real-world population datasets and generates insightful statistics.
Data: The data for this project is the population information by areas and ages, distributed in two data files. You need to find the correct data association across files. The files include the codes and names of Australian states, statistical areas (Level 2 & 3), and different age population groups living in these areas.
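One way to associate the two files is to key both on the SA2 code. The sketch below assumes the rows have already been split into lists of strings. The function name, the made-up rows, and the column positions are all hypothetical, so check the positions against the headings of the actual files.

```python
def build_sa2_to_sa3(area_rows, sa2_col, sa3_col):
    """Map each SA2 code to the SA3 code it belongs to.

    area_rows: data rows (heading row already removed), each a list of strings.
    sa2_col, sa3_col: column positions; these are assumptions, not taken
    from the real files.
    """
    mapping = {}
    for row in area_rows:
        sa2_code = row[sa2_col].strip().lower()
        sa3_code = row[sa3_col].strip().lower()
        mapping[sa2_code] = sa3_code
    return mapping


# Made-up rows for illustration only: state name, SA3 code, SA2 code.
rows = [
    ["South Australia", "40101", "401011001"],
    ["South Australia", "40101", "401011002"],
    ["South Australia", "40102", "401021003"],
]
sa2_to_sa3 = build_sa2_to_sa3(rows, sa2_col=2, sa3_col=1)
```

With such a mapping, the SA3 area of any SA2 code found in the population file can be looked up directly.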

Task: You are required to write a Python 3 program that will read two CSV files. After reading the files, your program is required to complete the following tasks. More details are given in the Output specification section.
- Find the age group that contains a specific input age.
- Calculate population statistics for two specific SA3 areas.
- Find the SA3 area with the largest population in the age group, for each unique state, and its percentage.
- Calculate the correlation between the age structure of two specific SA2 areas.
Requirements
- You are not allowed to import any external or internal module in Python. While use of many of these modules, e.g., csv or math is a perfectly sensible thing to do in a production setting, it takes away much of the point of different aspects of the project, which is about getting practice opening text files, processing text file data, and use of basic Python structures, in this case lists and loops.
- Ensure your program does NOT call the input() function at any time. Calling the input() function will cause your program to hang, waiting for input that the automated testing system will not provide (in fact, what will happen is that if the marking program detects the call(s), it will not test your code at all which may result in zero grade).
- Your program should also not call the print() function at any time except for the case of graceful termination (if needed). If your program has encountered an error state and is exiting gracefully, then your program needs to return zero (for number), None (for string), or an empty list (for list) and print an appropriate message. At no point should you print the program’s outputs instead of (or in addition to) returning them, or provide a printout of the program’s progress in calculating such outputs.
- Do not assume that the input file names will end in .csv. File name suffixes such as .csv and .txt are not mandatory in systems other than Microsoft Windows. Do not require within your program that the file name ends with .csv or any other extension (or try to add an extension onto the provided CSV file argument), as doing so can easily lead to an error and lost marks.
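The requirements above can be sketched as follows: the file is opened by the name given, with no extension check, and parsed with plain string methods instead of the csv module. The function names are hypothetical, and the sketch assumes that no field contains a comma inside quotes.

```python
def split_csv_lines(lines):
    """Split raw text lines into a heading list and a list of data rows."""
    rows = []
    for line in lines:
        line = line.strip()
        if line == "":
            continue  # skip blank lines
        rows.append([field.strip() for field in line.split(",")])
    if not rows:
        return [], []
    return rows[0], rows[1:]


def read_table(filename):
    """Read a comma-separated text file; return ([], []) if it cannot be read."""
    try:
        # The name is used exactly as provided: no extension is checked or added.
        with open(filename, "r") as handle:
            lines = handle.readlines()
    except OSError:
        # Graceful termination: print a message and return empty lists.
        print("Error: could not read file", filename)
        return [], []
    return split_csv_lines(lines)
```

Note that print() is used only on the error path, which is the one case the requirements allow.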
Input
Your program must define the function main with the following syntax:
def main(csvfile_1, csvfile_2, age, sa2_1, sa2_2):
The input arguments for this function are:
- IP1 (csvfile_1): The name of the CSV file (as a string) containing the relationship data between different statistical levels of areas for each state. The first row of the CSV file will contain the headings of the columns. A sample CSV file, “SampleData_Areas.csv” is provided with the project sheet on LMS and Moodle.
- IP2 (csvfile_2): The name of the CSV file (as a string) containing the record of the population. The first row of the CSV file will contain the headings of the columns. A sample CSV file, “SampleData_Populations.csv”, is provided with the project sheet on LMS and Moodle.
- IP3 (age): Age as an integer.
- IP4 (sa2_1): String containing the code of an SA2 area.
- IP5 (sa2_2): String containing the code of another SA2 area.
Output
We expect 4 outputs in the order below.
OP1: Output will be a list including two integers, indicating the lower bound and the upper bound of the age group containing the input age (IP3: age). Use None as the list element if one of the bounds does not exist.
Note: The age group identified in this output will be used to calculate the other outputs.
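A minimal sketch of OP1 is given below. It assumes, hypothetically, that the age columns carry labels such as "Age 15-19" and that a label with a single number (e.g., "Age 85 and over") is open-ended at the top; the real headings may differ. Pass only the age column labels, because any other heading containing a digit would be misread as an age group.

```python
def age_bounds(label):
    """Extract [lower, upper] from a label; upper is None if open-ended."""
    numbers = []
    current = ""
    for ch in label:
        if ch.isdigit():
            current += ch
        elif current:
            numbers.append(int(current))
            current = ""
    if current:
        numbers.append(int(current))
    if len(numbers) == 2:
        return [numbers[0], numbers[1]]
    if len(numbers) == 1:
        return [numbers[0], None]  # assumption: a single number is a lower bound
    return None  # the label carries no age range


def find_age_group(age_labels, age):
    """Return the bounds of the first age group that contains age."""
    for label in age_labels:
        bounds = age_bounds(label)
        if bounds is None:
            continue
        lower, upper = bounds
        if age >= lower and (upper is None or age <= upper):
            return bounds
    # Graceful termination: no group matched.
    print("Error: no age group contains", age)
    return []


# Hypothetical labels for illustration only.
labels = ["Age 0-4", "Age 5-9", "Age 10-14", "Age 15-19", "Age 85 and over"]
```

For these labels, find_age_group(labels, 18) returns [15, 19], matching the shape of OP1.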
OP2: Output will be a list of two lists.
The first list includes three elements in the order below:
- the code of the SA3 area where input IP4 (sa2_1) is located,
- the average of the populations in the identified age group (OP1) across all SA2 areas in this SA3 area,
- the standard deviation of the populations in the identified age group (OP1) across all SA2 areas in this SA3 area.
The second list is similar to the first one, but for input IP5 (sa2_2).
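The two statistics in OP2 can be computed without the math module, as in the sketch below. The formula sheet is not reproduced in this section, so the choice between the sample (n - 1) and population (n) denominator is left as a hypothetical switch; use whichever the official formula specifies.

```python
def mean_and_std(values, sample=True):
    """Return (mean, standard deviation) of a list of numbers, unrounded.

    sample=True divides by n - 1, sample=False divides by n. Which one the
    project expects is an assumption to check against the formula sheet.
    """
    n = len(values)
    if n == 0:
        print("Error: no values to summarise")
        return 0, 0
    mean = sum(values) / n
    denominator = n - 1 if sample else n
    if denominator == 0:
        print("Error: standard deviation needs more than one value")
        return mean, 0
    squared_diffs = 0
    for value in values:
        squared_diffs += (value - mean) ** 2
    # A square root without importing math: raise to the power 0.5.
    return mean, (squared_diffs / denominator) ** 0.5
```

Rounding is deliberately left to the caller, so that the values stay at full precision until they are stored in the output list.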
OP3: Output will be a list of list(s). Each inner list corresponds to a unique state in the data, including three elements in the order below:
- the state name,
- The name of the SA3 area with the largest population in the identified age group (OP1), in the state,
- the percentage that the population found above represents of the total population across all age groups in the same SA3 area.
The inner list(s) should be sorted in alphabetical ascending order by the state name. Hint: Look for sort() or sorted() function.
When there are multiple areas with the same largest population, choose the first one in alphabetical order in terms of area code.
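The selection, tie-breaking, and sorting rules for OP3 can be sketched as below. The record layout and all figures are made up for illustration, and the share is returned as a fraction because the sample output in this brief shows values such as 0.0595. Area codes are compared as strings, which assumes all codes have the same length.

```python
def largest_sa3_per_state(records):
    """records: list of [state, sa3_code, sa3_name, group_pop, total_pop]."""
    best = {}
    for state, code, name, group_pop, total_pop in records:
        current = best.get(state)
        # Keep the larger population; on a tie keep the smaller area code.
        if (current is None
                or group_pop > current[2]
                or (group_pop == current[2] and code < current[0])):
            best[state] = [code, name, group_pop, total_pop]
    result = []
    for state in sorted(best):  # alphabetical ascending by state name
        code, name, group_pop, total_pop = best[state]
        if total_pop == 0:
            print("Error: total population is zero for", name)
            share = 0
        else:
            share = group_pop / total_pop
        result.append([state, name, round(share, 4)])
    return result


# Made-up figures for illustration only.
records = [
    ["tasmania", "60201", "launceston", 50, 1000],
    ["tasmania", "60101", "hobart inner", 50, 800],
    ["south australia", "40304", "onkaparinga", 120, 2000],
    ["south australia", "40101", "adelaide city", 30, 600],
]
```

Here the two Tasmanian areas tie at 50, so the one with the smaller code ("60101") is chosen.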
OP4: Output will be a float number, which is the correlation coefficient between the populations in each age group in the first input IP4 (sa2_1), and the second input IP5 (sa2_2).
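A sketch of OP4 follows, assuming the intended measure is the Pearson correlation coefficient (the official formula is not reproduced in this section). The function name is hypothetical; xs and ys are the populations of the two SA2 areas, listed in the same age-group order.

```python
def correlation(xs, ys):
    """Pearson correlation of two equal-length lists, rounded to 4 places."""
    n = len(xs)
    if n == 0 or n != len(ys):
        print("Error: the two lists must be non-empty and of equal length")
        return 0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = 0
    spread_x = 0
    spread_y = 0
    for x, y in zip(xs, ys):
        dx = x - mean_x
        dy = y - mean_y
        covariance += dx * dy
        spread_x += dx * dx
        spread_y += dy * dy
    denominator = (spread_x * spread_y) ** 0.5
    if denominator == 0:
        # One of the lists is constant, so the coefficient is undefined.
        print("Error: correlation is undefined for constant values")
        return 0
    return round(covariance / denominator, 4)
```

The zero-denominator check covers the boundary case where every age group in one area has the same population.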
All returned numeric outputs (both in lists and individual) must contain values rounded to four decimal places (if required to be rounded off). Do not round the values during calculations. Instead, round them only at the time when you save them into the final output variables.
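The small example below illustrates why the rounding should happen only at the end. The values are made up, and the variable names are hypothetical.

```python
values = [0.00004, 0.00004]

# Rounding each value first throws away information before the sum.
rounded_early = round(values[0], 4) + round(values[1], 4)

# Keeping full precision and rounding once preserves the contribution
# of both values.
rounded_late = round(values[0] + values[1], 4)
```

Here rounded_early is 0.0, while rounded_late is 0.0001.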
Examples
Download the SampleData_Areas.csv and SampleData_Populations.csv files from the folder of Project 1 on LMS or Moodle. An example of how you can call your program from the Python shell and examine the results it returns is provided below:
>>> OP1, OP2, OP3, OP4 = main('SampleData_Areas.csv', 'SampleData_Populations.csv', 18, '401011001', '401021003')
The returned output variables are:
>>> OP1
[15, 19]
>>> OP2
[['40101', 782.5, 376.8879], ['40102', 689.625, 493.9609]]
>>> OP3
[['south australia', 'onkaparinga', 0.0595], ['tasmania', 'launceston', 0.0591], ['western australia', 'wanneroo', 0.0694]]
>>> OP4
0.0154
Assumptions
Your program can assume the following:
- Anything that is meant to be a string (e.g., names and codes of states and areas) will be a string, and anything that is meant to be numeric will be numeric in the CSV file.
- All string data in the CSV files is case-insensitive, which means “Perth” is the same as “perth”. Your program needs to treat both as the same. Similarly, your program needs to handle the string input parameters in the same way. All string outputs are also treated case-insensitively. The output must contain all strings in lowercase.
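A simple way to meet the case-insensitivity assumption is to normalise every string once, when it is read from a file or received as a parameter. The helper name is hypothetical, and stripping surrounding whitespace is an extra precaution that the brief does not require.

```python
def normalise(text):
    """Return text in lowercase with surrounding whitespace removed."""
    return text.strip().lower()
```

Applying the same helper to file data, input parameters, and outputs keeps all comparisons and returned strings in lowercase.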
- The order of columns in each row will follow the order of the headings provided in the first row. However, rows can be in random order except the first row containing the headings.
- No data will be missing in the CSV files; however, values can be zero and must be accounted for in mathematical calculations.
[In case any part of the calculation cannot be performed due to zero values or other boundary conditions, do a graceful termination by printing an error message and returning a zero value (for number), None (for string) or empty list (for list) depending on the data type. Your program must not crash.]
- The number of states, SA3 areas, and SA2 areas will vary, so do not hard-code them.
- The main function will always be provided with valid input parameters.
- The necessary formulas are provided at the end of this document.