Programming Question

refer to attachment for details

ANL351
SAS Programming and Its Application
Tutor-Marked Assignment
July 2023 Presentation
TUTOR-MARKED ASSIGNMENT (TMA)
This assignment is worth 24% of the final mark for ANL351 SAS Programming and Its
Application.
The cut-off date for this assignment is 14 September 2023, 2355hrs.
Up to 25 marks of penalties will be imposed for inappropriate or poor paraphrasing. Serious
cases will be investigated by the examination department. More information on effective
paraphrasing strategies can be found on
https://academicguides.waldenu.edu/writingcenter/evidence/paraphrase/effective.
If your course involves programming, you are urged to read the following articles as well:
https://wiki.cs.astate.edu/index.php/Plagiarism_in_a_Programming_Context
https://www.turnitin.com/blog/plagiarism-and-programming-how-to-code-without-plagiarizing-2
Note to Students:
Compose your report using Microsoft Office Word, and save either as .doc or .docx
(preferred).
You are to include the following particulars in your submission: Course Code, Title of the
TMA, SUSS PI No., Your Name, and Submission Date.
Use of Generative AI Tools (Allowed)
The use of generative AI tools is allowed for this assignment.

You are expected to provide proper attribution if you use generative AI tools while
completing the assignment, including appropriate and discipline-specific citation, a
table detailing the name of the AI tool used, the approach to using the tool (e.g. what
prompts were used), the full output provided by the tool, and which part of the output
was adapted for the assignment;

To take note of section 3, paragraph 3.2 and section 5.2, paragraph 2A.1 (Viva Voce)
of the Student Handbook;

The University has the right to exercise the viva voce option to determine the authorship
of a student’s submission should there be reasonable grounds to suspect that the
submission may not be fully the student’s own work.

For more details on academic integrity and guidance on responsible use of generative
AI tools in assignments, please refer to the TLC website;

The University will continue to review the use of generative AI tools based on feedback
and in light of developments in AI and related technologies.
SINGAPORE UNIVERSITY OF SOCIAL SCIENCES (SUSS)
Question 1
The text file “crx.csv” contains records of 690 credit applications. Each application consists of
15 attributes. All attribute names and values have been changed to meaningless symbols to
protect confidentiality of the data. Therefore, their variable names are denoted as A01 to A15.
The result of each credit application is recorded in the variable A16 where “+” indicates that
the credit application was successful while the disapproved cases are marked as “-”. Below is
a list of the possible values or data type of each variable:
A01: b, a
A02: continuous
A03: continuous
A04: u, y, l, t
A05: g, p, gg
A06: c, d, cc, i, j, k, m, r, q, w, x, e, aa, ff
A07: v, h, bb, j, n, z, dd, ff, o
A08: continuous
A09: t, f
A10: t, f
A11: continuous
A12: t, f
A13: g, p, s
A14: continuous
A15: continuous
A16: +, -
Note: You are required to adhere to the following requirements for EACH part of the question:
• Do not change anything in the original crx.csv file. Marks will be deducted if you
do so.
• Copy and paste your SAS program as monospaced font TEXT (not screenshot!).
• Copy and paste the resulting SAS log OR output window as SCREENSHOT in your report
(it is totally sufficient to show a part of the log/output window in ONE screenshot). The
screenshots should not have a height of more than 7cm.
• Add comments to make your program clear and readable.
(a)
Create a library called “ANL351” in your SAS environment.
(b)
Execute a SAS DATA step ONLY to import and convert “crx.csv” to a SAS data file
called “CRX”, which should be stored in the library “ANL351”. Specify the format of
each column accordingly, with the corresponding text length adjustment (no PROC step
is allowed here).
(10 marks)
(c)
Implement the corresponding PROC steps to check the observation length and file
size of the “CRX” dataset, as well as the path in which the data are stored. Report its
number of observations and variables as well.
(10 marks)
(5 marks)
(d)
Based on the dataset “CRX”, create a new dataset called “CRX_NOMISS” with a new
DATA step, and remove all the observations in which at least one missing value exists
(could be in any column). Report the number of applications that have been deleted in
this process as well as the approval result of those deleted cases (Note: you cannot use
the new dataset that does not contain the removed rows for this particular task).
(15 marks)
(e)
Apply an adequate PROC step to print the distribution of the categories in A05 among
the “disapproval” cases in the dataset without the missing observations which you have
created in (d). How many applications does each category have here?
(10 marks)
(f)
Discuss whether an applicant who belongs to category “b” in A01 is generally more
likely to be approved than one who belongs to category “a”. Construct the necessary
DATA step(s) for this purpose (no PROC step is allowed here).
(20 marks)
(g)
After carrying out a few analyses, the researcher decided to do some subsetting of the
dataset based on his findings. First, he believes that all continuous variables except A14
and A15 have no impact on the approval outcome. Second, all cases with a value of 0
in A14 OR A15 are irrelevant for his research. As a result, he would like to remove
these observations entirely from the “crx_nomiss” dataset if they fulfil either one of the
conditions. Write a DATA step to generate a subset of the dataset without the missing
observations created in part (d). Report the number of remaining records in the new
dataset.
(10 marks)
(Total: 80 marks)
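
The setup asked for in parts (a) to (c) could be sketched as follows. This is a minimal starting point and not a model answer. The folder and file paths are hypothetical, the absence of a header row is an assumption, and the character lengths are taken from the longest value listed for each variable above. The brief does not say how crx.csv encodes missing values, so inspect the raw file before relying on this.

```sas
/* Part (a): assign the libref. The folder is hypothetical and must already exist. */
libname ANL351 "/home/your_user/ANL351";

/* Part (b): read crx.csv with a DATA step only. */
data ANL351.CRX;
    /* Declare all 16 variables in file order so that type, length and
       column order are explicit. Character lengths follow the longest
       listed value, for example "gg" in A05 needs $2. */
    length A01 $1 A02 8 A03 8 A04 $1 A05 $2 A06 $2 A07 $2 A08 8
           A09 $1 A10 $1 A11 8 A12 $1 A13 $1 A14 8 A15 8 A16 $1;

    /* Assumed file location. DSD reads comma-separated fields and treats
       two consecutive commas as a missing value.
       Assumption: no header row. Add FIRSTOBS=2 if the file has one. */
    infile "/home/your_user/ANL351/crx.csv" dsd truncover;

    /* List input; the types were already fixed by the LENGTH statement. */
    input A01-A16;
run;

/* Part (c): PROC CONTENTS reports the number of observations and variables
   and the observation length; on most hosts it also lists the physical
   file name and the file size. */
proc contents data=ANL351.CRX;
run;
```

If a numeric field holds a non-numeric marker, SAS sets the value to missing and writes an invalid-data note to the log, whereas a character field keeps the marker as an ordinary value. Check the log after the import before moving on to part (d).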
Question 2
Explain how the DATA step you used in Question 1(b) to import the “CRX” dataset is
processed by means of the Program Data Vector (PDV) (max. 200 words). Use the import of
the “crx.csv” file as per Question 1(b) to illustrate your explanation.
(20 marks)
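
One way to observe the PDV while preparing this explanation is to write its contents to the log during the import. The sketch below is a diagnostic aid under assumptions: the file path is hypothetical, the file is assumed to have no header row, and OBS=3 limits the trace to the first three records. DATA _NULL_ is used so that no dataset is created.

```sas
data _null_;
    /* At compile time SAS creates the PDV with these 16 variables plus
       the automatic variables _N_ and _ERROR_. */
    length A01 $1 A02 8 A03 8 A04 $1 A05 $2 A06 $2 A07 $2 A08 8
           A09 $1 A10 $1 A11 8 A12 $1 A13 $1 A14 8 A15 8 A16 $1;

    /* Assumed file location; OBS=3 stops after the third record. */
    infile "/home/your_user/ANL351/crx.csv" dsd truncover obs=3;

    /* Start of an iteration: variables read from raw data are missing. */
    putlog "PDV at start of iteration:";
    putlog _all_;

    /* INPUT copies one record into the input buffer and then moves the
       field values into the PDV. */
    input A01-A16;

    putlog "PDV after INPUT:";
    putlog _all_;
run;
```

In the log, _N_ counts the iterations and the values of A01 to A16 return to missing at the start of each one. In the actual import step, the PDV contents are written to ANL351.CRX by the implicit output at the end of every iteration.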
---- END OF ASSIGNMENT ----