ECE-C
3
5
3 Systems Programming
Project
1
– Writing a Basic Shell
In this assignment you will be developing a user shell (i.e. command line interface) similar to bash. Upon
completion, your shell must be able to perform the following operations:
• Display the current working directory within the prompt (before the dollar sign).
/home/jshack/pssh$
• Run a single command with optional input and output redirection. Command line arguments must be
supported.
/home/jshack/pssh$ ./my_prog arg1 arg
2
arg3
/home/jshack/pssh$ ls -l > directory_contents.txt
/home/jshack/pssh$ grep -n “test” < input.txt > output.txt
• Run multiple pipelined commands with optional input and output redirection. Naturally, command
line arguments to programs must still be supported.
/home/jshack/pssh$ ./my_prog arg1 arg2 arg3 | wc -l > out.txt
/home/jshack/pssh$ ls -lh | awk ’{print $9 ” is ” $5}’
/home/jshack/pssh$ ps aux | grep bash | grep -v grep | awk ’{print $2}’
• Implement the builtin command ‘exit’, which will terminate the shell.
/home/jshack/pssh$ exit
• Implement the builtin command ‘which’. This command accepts 1 parameter (a program name),
searches the system PATH for the program, and prints its full path to stdout if found (or simply
nothing if it is not found). If a fully qualified path or relative path is supplied to an executable
program, then that path should simply be printed to stdout. If the supplied program name is another
builtin command, your shell should indicate that in a message printed to stdout. The behavior should
be identical to bash’s builtin which command.
/home/jshack/pssh$ which ls
/bin/ls
/home/jshack/pssh$ which lakjasdlkfjasdlkfj
/home/jshack/pssh$ which exit
exit: shell built-in command
/home/jshack/pssh$ which which
which: shell built-in command
/home/jshack/pssh$ which man
/usr/bin/man
Writing a command line shell is obviously going to require that you be able to parse the text the user types
into the prompt into a more meaningful structure that is easier for you (the developer) to handle. Since text
processing is not the major focus of this course (hint: it’s actually systems programming), I have provided
1
you with a parser as well as a basic skeleton for your shell program. I call this shell program skeleton the
Pretty Simple SHell (pssh). The main input loop looks like this:
…
#define DEBUG_PARSE 0
…
while (1) {
cmdline = readline (build_prompt());
if (!cmdline) /* EOF (ex: ctrl-d) */
exit (EXIT_SUCCESS);
P = parse_cmdline (cmdline);
if (!P)
goto next;
if (P->invalid_syntax) {
printf (“pssh: invalid syntax\n”);
goto next;
}
#if DEBUG_PARSE
parse_debug (P);
#endif
execute_tasks (P);
next:
parse_destroy (&P);
free(cmdline);
}
…
As you can see, our command line shell reads input from the user one line at a time via readline(), which
returns a char* pointing to memory allocated on the heap containing a NULL terminated string of what the
user typed in before hitting Enter. This is passed to the provided parse cmdline() API function I have
provided to you. The implementation can be found in parse.c and the API function declarations can be
found in parse.h. The parse cmdline() API returns a Parse*, which points to heap memory. The Parse
structure is defined in parse.h as follows:
typedef struct {
Task* tasks; /* ordered list of tasks to pipe */
int ntasks; /* # of tasks in the parse */
char* infile; /* filename of ’infile’ */
char* outfile; /* filename of ’outfile’ */
int background; /* run process in background? */
int invalid_syntax; /* parse failed */
} Parse;
as is the Task structure:
typedef struct {
char* cmd;
char** argv; /* NULL terminated array of strings */
} Task;
2
As you can see, the Parse data structure contains all of the parsed information from a command line with
following anatomy:
command_1 [< infile] [| command_n]* [> outfile] [&]
where:
• Items in brackets [ ] are optional
• Items in starred brackets [ ]* are optional but can be repeated
• Non-bracketed items are required
In other words, I have done all the “hard work” for you. Your job will simply to be to take a Parse structure
and implement all of the process creation and management logic generally provided by a shell. In order
to make it obvious how to work with a Parse structure, I have provided the parse debug() API function,
which simple prints the contents of a Parse to the screen. Your job, for this project, largely involves writing
logic for the provided execute tasks() function in pssh.c. I recommend you start by simply getting the
execution of single commands working.
Naturally, this is a simple vfork() and exec() problem. Get this working first. Also, remember that exec()
is not a real function, but is rather a term used to refer to the family of exec functions: execl(), execlp(),
execle(), execv(), execvp(), execvpe(). Read the manual page for exec to figure out which one you
want to use!
~$ man exec
Once your pssh is capable of executing simple commands with arguments, such as:
$ ls -l
you need to add support for input and output redirection using the < and > command line operators… just
like in bash! This is actually easier than in seems. Hint: you will need to modify the file descriptors tables.
Once you have this working, you will be able to do great things like:
$ ls -l > directory.txt
$ grep “some phrase” < somefile.txt
Great job! Next step is connecting multiple commands together with pipes!
This is actually very similar to what we did in lecture using the dup2() system call. In fact, it is nearly
exactly the same–just far more practical. Not much needs to be said here other than that, unlike our example
in lecture, you must be able to pipeline more than just two processes – you must pipeline all of the commands
in the tasks array found in the Parse structure.
Since it simply ends the pssh process, the built-in command exit is going to be the easiest. Implement that
first. Now, the which command is going to be a little bit more complicated. I recommend looking at the
3
command found() function in pssh.c and use that as a working base. Also, be sure to read the manual page
for the access() system call:
~$ man access
Keep in mind that, with the exception of exit, any built-in command should support input and output
redirection. For example, the following should work:
$ which man > man_location.txt
$ cat man_location.txt
/usr/bin/man
This may sound more challenging than it actually is. Did you figure it out?
The provided code uses a Makefile to help build the code. All you need to do to build the program is run
the make command within the source directory:
~/src/pssh$ ls
builtin.c Makefile parse.h
builtin.h parse.c pssh.c
~/src/pssh$ make
gcc -g -Wall -c builtin.c -o builtin.o
gcc -g -Wall -c parse.c -o parse.o
gcc -g -Wall -c pssh.c -o pssh.o
gcc builtin.o parse.o pssh.o -Wall -lreadline -o pssh
~/src/pssh$ ls
builtin.c Makefile parse.o pssh.o
builtin.h parse.c pssh
builtin.o parse.h pssh.c
and just like that, all of the necessary steps to produce the pssh executable will be automatically ran. If you
want to delete all of the intermediate object files and the pssh executable, simply run make clean within
the source directory:
~/src/pssh$ ls
builtin.c Makefile parse.o pssh.o
builtin.h parse.c pssh
builtin.o parse.h pssh.c
~/src/pssh$ make clean
rm -f *.o
rm -f pssh
~/src/pssh$ ls
builtin.c Makefile parse.h
builtin.h parse.c pssh.c
4
Once you have implemented all of the required features described in this document submit your code by
doing the following:
• Run make clean in your source directory. We must be able to built your shell from source and we
don’t want your precompiled executables of intermediate object files. If your code does not at the
very least compile, you will receive a zero.
• Zip up your code.
• Name your zip file abc123 pssh.zip, where abc123 is your Drexel ID.
• Upload your zip file using the Blackboard Learn submission link found on the course website.
Failure to follow these simple steps will result in your project not being graded.
Now, have fun!
Due in 2 Weeks: Thursday, Feb 1st before midnight.
5
- The Big Idea
The Parser
Executing Single Commands
Executing Multiple Pipelined Commands
Adding the Built-in Commands
Building the Code
Submitting Your Project
#ifndef _builtin_h_
#define _builtin_h_
#include “parse.h”
int is_builtin (char* cmd);
void builtin_execute (Task T);
int builtin_which (Task T);
#endif /* _builtin_h_ */
#include
#include
#include
#include
#include “builtin.h”
#include “parse.h”
static char* builtin[] = {
“exit”, /* exits the shell */
“which”, /* displays full path to command */
NULL
};
int is_builtin (char* cmd)
{
int i;
for (i=0; builtin[i]; i++) {
if (!strcmp (cmd, builtin[i]))
return 1;
}
return 0;
}
void builtin_execute (Task T)
{
if (!strcmp (T.cmd, “exit”)) {
exit (EXIT_SUCCESS);
}
else {
printf (“pssh: builtin command: %s (not implemented!)\n”, T.cmd);
}
}
#ifndef _parse_h_
#define _parse_h_
#include
char* cmd;
char** argv; /* NULL terminated array of strings */
} Task;
typedef struct {
Task* tasks; /* ordered list of tasks to pipe */
int ntasks; /* # of tasks in the parse */
char* infile; /* filename of ‘infile’ */
char* outfile; /* filename of ‘outfile’ */
int background; /* run process in background? */
int invalid_syntax; /* parse failed */
} Parse;
Parse* parse_cmdline (char* cmdline);
void parse_destroy (Parse** P);
void parse_debug (Parse* P);
#endif /* _parse_h_ */
/* Author: James A. Shackleford
* Date: January 16th, 2018
*
* Last Updated:
* January 29th, 2018 — properly support quoted arguments
* January 26th, 2018 — argv[] easier to use with exec()
*
* A simple shell command parser. If you are familiar with
* using bash, zsh, tcsh, etc. then you already understand
* what this does.
*
* Parses the following syntax:
*
* ~$ command_1 [< infile] [| command_n]* [> outfile] [&]
*
* and produces a correspondingly populated Parse structure on the heap
*
* Note:
* – Items in brackets [ ] are optional
* – Items in starred brackets [ ]* are optional but can be repeated
* – Non-bracketed items are required
*
* Examples of valid syntax:
*
* ~$ echo “foo!!!!!!!” > foo.txt
* ~$ wc -l < somefile.txt > numlines.txt
* ~$ ls -lh | grep 8.*K | wc -l
* ~$ gvim &
**********************************************************************/
#include
#include
#include
#include
#include “parse.h”
typedef struct {
char* cmd;
char** argv;
char* input_fn;
char* output_fn;
} Unit;
static char ops[] = {‘>’, ‘<', '|', '\0'};
static void trim (char* s)
{
size_t start, end;
if (!s)
return;
end = strlen (s);
for (start=0; isspace (s[start]); ++start);
if (s[start]) {
while (end > 0 && isspace(s[end-1]))
–end;
memmove(s, &s[start], end-start);
}
s[end-start] = ‘\0’;
}
static int is_background (char* cmdline)
{
size_t last;
int ret = 0;
trim (cmdline);
last = strlen (cmdline) – 1;
if (‘&’ == cmdline[last]) {
cmdline[last] = ‘\0’;
ret = 1;
}
return ret;
}
static int is_empty (char* cmdline)
{
trim (cmdline);
if (!cmdline || !strlen(cmdline))
return 1;
return 0;
}
static int is_op (char c)
{
int i;
for (i=0; ops[i]; i++)
if (c == ops[i])
return 1;
return 0;
}
static int has_trailing (char needle, char* haystack)
{
trim (haystack);
if (haystack[0] == needle ||
haystack[strlen(haystack)-1] == needle) {
return 1;
}
return 0;
}
static unsigned int count_char (char needle, char* haystack)
{
unsigned int c = 0;
while (*haystack)
if (*haystack++ == needle)
c++;
return c;
}
static unsigned int count_args (char* unit)
{
unsigned int n = 0;
for (; *unit; unit++) {
if (isspace((unsigned char)(*unit))) {
unit++;
if (*unit == ‘\”‘) {
do {
unit++;
} while (*unit != ‘\”‘);
n++;
} else if (*unit == ‘\”) {
do {
unit++;
} while (*unit != ‘\”);
n++;
} else if (!isspace((unsigned char)(*unit))) {
n++;
}
}
}
return n;
}
static int valid_syntax (Parse* P, Unit* U, int i)
{
if (!U)
return 0;
if (U->input_fn && ((i != 0) || is_empty(U->input_fn)))
return 0;
if (U->output_fn && ((i != P->ntasks-1) || is_empty(U->output_fn)))
return 0;
if (!U->cmd || !*U->cmd)
return 0;
return 1;
}
static char* parse_unary (char op, char* unit)
{
char *start, *end, *arg;
for (start=NULL; *unit; unit++)
if (*unit == op) {
start = ++unit;
break;
}
if (!start)
return NULL;
for (end=start; *end; end++)
if (is_op(*end))
break;
arg = strndup (start, end – start);
arg[end – start] = ‘\0’;
trim (arg);
start–;
memset (start, ‘ ‘, end – start);
return arg;
}
static char* argtok (char* str, char** state)
{
char* ret;
if (!str)
str = *state;
str += strspn (str, ” \”\'”);
if (!*str)
return NULL;
ret = str;
if (*(str-1) == ‘\”‘)
str += strcspn (str, “\””);
else if (*(str-1) == ‘\”)
str += strcspn (str, “\'”);
else
str += strcspn (str, ” “);
if (*str)
*str++ = ‘\0’;
*state = str;
return ret;
}
static void parse_command (Unit* U, char* unit)
{
unsigned int argc, n;
char *str, *token, *state;
trim (unit);
argc = count_args (unit)+1; /* +1 for command */
U->argv = malloc ((argc+1) * sizeof(*U->argv));
U->argv[argc] = NULL;
for (n=0, str=unit; ; n++, str=NULL) {
token = argtok (str, &state);
if (!token)
break;
U->argv[n] = strdup (token);
}
U->cmd = U->argv[0];
}
static Unit* parse_unit (char* unit)
{
Unit* U;
int infiles = count_char (‘<', unit);
int outfiles = count_char ('>‘, unit);
if (infiles > 1 || outfiles > 1)
return NULL;
if (count_char (‘\”, unit) % 2)
return NULL;
if (count_char (‘\”‘, unit) % 2)
return NULL;
U = malloc (sizeof(*U));
U->cmd = NULL;
U->argv = NULL;
if (infiles)
U->input_fn = parse_unary (‘<', unit);
else
U->input_fn = NULL;
if (outfiles)
U->output_fn = parse_unary (‘>’, unit);
else
U->output_fn = NULL;
parse_command (U, unit);
return U;
}
static void unit_destroy (Unit** U)
{
int i;
if (!*U)
return;
if ((*U)->input_fn)
free ((*U)->input_fn);
if ((*U)->output_fn)
free ((*U)->output_fn);
if ((*U)->argv) {
for (i=0; (*U)->argv[i]; i++)
free ((*U)->argv[i]);
free ((*U)->argv);
}
free (*U);
*U = NULL;
}
static void parse_add_unit (Parse* P, Unit* U, int i)
{
if (!valid_syntax (P, U, i)) {
P->invalid_syntax = 1;
goto out;
}
P->tasks[i].cmd = U->cmd;
if (U->argv) {
P->tasks[i].argv = U->argv;
U->argv = NULL;
}
if (U->input_fn && (i == 0)) {
P->infile = U->input_fn;
U->input_fn = NULL;
}
if (U->output_fn && (i == P->ntasks-1)) {
P->outfile = U->output_fn;
U->output_fn = NULL;
}
out:
unit_destroy (&U);
}
static Parse* parse_new ()
{
Parse* P = malloc (sizeof(*P));
P->tasks = NULL;
P->ntasks = 0;
P->infile = NULL;
P->outfile = NULL;
P->background = 0;
P->invalid_syntax = 0;
return P;
}
static void parse_init (Parse* P, char* cmdline)
{
P->background = is_background (cmdline);
if (count_char (‘&’, cmdline)) {
P->invalid_syntax = 1;
return;
}
if (has_trailing (‘|’, cmdline)) {
P->invalid_syntax = 1;
return;
}
P->ntasks = count_char (‘|’, cmdline) + 1;
P->tasks = malloc (P->ntasks * sizeof (*P->tasks));
memset (P->tasks, 0, P->ntasks * sizeof (*P->tasks));
}
void parse_destroy (Parse** P)
{
int i, j;
if (!*P)
return;
if ((*P)->infile)
free ((*P)->infile);
if ((*P)->outfile)
free ((*P)->outfile);
if ((*P)->tasks) {
for (i=0; i<(*P)->ntasks; i++) {
if ((*P)->tasks[i].argv) {
for (j=0; (*P)->tasks[i].argv[j]; j++)
free ((*P)->tasks[i].argv[j]);
free ((*P)->tasks[i].argv);
}
}
free ((*P)->tasks);
}
free (*P);
*P = NULL;
}
Parse* parse_cmdline (char* cmdline)
{
char *str, *token, *state;
int i;
Unit* U;
Parse* P;
if (is_empty (cmdline))
return NULL;
P = parse_new ();
parse_init (P, cmdline);
for (i=0, str=cmdline; !P->invalid_syntax; i++, str=NULL) {
token = strtok_r (str, “|”, &state);
if (!token)
break;
U = parse_unit (token);
parse_add_unit (P, U, i);
}
return P;
}
void parse_debug (Parse* P)
{
int i, j;
fprintf (stderr, “==[ DEBUG: PARSE ]==================================\n”);
fprintf (stderr, “Run in Background? %s\n”, P->background ? “Yes” : “No”);
if (P->infile)
fprintf (stderr, “infile: %s\n”, P->infile);
if (P->outfile)
fprintf (stderr, “outfile: %s\n”, P->outfile);
fprintf (stderr, “ntasks: %i\n”, P->ntasks);
for (i=0; i
fprintf (stderr, “Task %i\n”, i);
fprintf (stderr, ” – cmd: [%s]\n”, P->tasks[i].cmd);
if (P->tasks[i].argv)
for (j=0; P->tasks[i].argv[j]; j++)
fprintf (stderr, ” + arg[%i]: [%s]\n”, j, P->tasks[i].argv[j]);
}
fprintf (stderr, “==================================[ DEBUG: PARSE ]==\n”);
}
#include
#include
#include
#include
#include
#include “builtin.h”
#include “parse.h”
/*******************************************
* Set to 1 to view the command line parse *
*******************************************/
#define DEBUG_PARSE 0
void print_banner ()
{
printf (” ________ \n”);
printf (“_________________________ /_ \n”);
printf (“___ __ \\_ ___/_ ___/_ __ \\ \n”);
printf (“__ /_/ /(__ )_(__ )_ / / / \n”);
printf (“_ .___//____/ /____/ /_/ /_/ \n”);
printf (“/_/ Type ‘exit’ or ctrl+c to quit\n\n”);
}
/* returns a string for building the prompt
*
* Note:
* If you modify this function to return a string on the heap,
* be sure to free() it later when appropirate! */
static char* build_prompt ()
{
return “$ “;
}
/* return true if command is found, either:
* – a valid fully qualified path was supplied to an existing file
* – the executable file was found in the system’s PATH
* false is returned otherwise */
static int command_found (const char* cmd)
{
char* dir;
char* tmp;
char* PATH;
char* state;
char probe[PATH_MAX];
int ret = 0;
if (access (cmd, F_OK) == 0)
return 1;
PATH = strdup (getenv(“PATH”));
for (tmp=PATH; ; tmp=NULL) {
dir = strtok_r (tmp, “:”, &state);
if (!dir)
break;
strncpy (probe, dir, PATH_MAX);
strncat (probe, “/”, PATH_MAX);
strncat (probe, cmd, PATH_MAX);
if (access (probe, F_OK) == 0) {
ret = 1;
break;
}
}
free (PATH);
return ret;
}
/* Called upon receiving a successful parse.
* This function is responsible for cycling through the
* tasks, and forking, executing, etc as necessary to get
* the job done! */
void execute_tasks (Parse* P)
{
unsigned int t;
for (t = 0; t < P->ntasks; t++) {
if (is_builtin (P->tasks[t].cmd)) {
builtin_execute (P->tasks[t]);
}
else if (command_found (P->tasks[t].cmd)) {
printf (“pssh: found but can’t exec: %s\n”, P->tasks[t].cmd);
}
else {
printf (“pssh: command not found: %s\n”, P->tasks[t].cmd);
break;
}
}
}
int main (int argc, char** argv)
{
char* cmdline;
Parse* P;
print_banner ();
while (1) {
cmdline = readline (build_prompt());
if (!cmdline) /* EOF (ex: ctrl-d) */
exit (EXIT_SUCCESS);
P = parse_cmdline (cmdline);
if (!P)
goto next;
if (P->invalid_syntax) {
printf (“pssh: invalid syntax\n”);
goto next;
}
#if DEBUG_PARSE
parse_debug (P);
#endif
execute_tasks (P);
next:
parse_destroy (&P);
free(cmdline);
}
}