- Prompt: Displays a prompt when waiting for a new command.
- History: Keeps a history of commands.
- Pipes: Supports pipes (`|`) to connect multiple commands.
- Redirections: Handles redirections (`>`, `>>`, `<`, `<<`) seamlessly.
- Environment Variables: Expands environment variables (`$HOME`, `$PATH`, etc.).
- Error Codes: Tracks and returns error codes (`$?`) for executed commands.
- Builtins: Implements internal commands:
  - `echo` (with the `-n` flag)
  - `cd` (change directory)
  - `pwd` (print working directory)
  - `export` (set environment variables)
  - `unset` (unset environment variables)
  - `env` (display environment variables)
  - `exit` (terminate the shell)
- Signals: Properly handles signals such as the following (a minimal handler is sketched below):
  - `Ctrl+C` (interrupts the current command and displays a new prompt)
  - `Ctrl+D` (exits the shell if pressed at an empty prompt)
  - `Ctrl+\` (ignored in interactive mode)
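
As an illustration, the sketch below shows one way this interactive behaviour can be wired up with GNU readline; the handler and function names, the prompt string, and the overall structure are assumptions for the sake of the example, not the project's actual code.

```c
/* Minimal sketch of interactive signal handling, assuming GNU readline
 * provides the prompt. Everything except the libc/readline calls is
 * illustrative. Compile with -lreadline. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <readline/readline.h>
#include <readline/history.h>

/* Ctrl+C: print a newline and redraw an empty prompt instead of exiting. */
static void	handle_sigint(int sig)
{
	(void)sig;
	write(STDOUT_FILENO, "\n", 1);
	rl_on_new_line();
	rl_replace_line("", 0);
	rl_redisplay();
}

static void	setup_signals(void)
{
	signal(SIGINT, handle_sigint);	/* Ctrl+C: interrupt, new prompt  */
	signal(SIGQUIT, SIG_IGN);		/* Ctrl+\: ignored interactively  */
}

int	main(void)
{
	char	*line;

	setup_signals();
	while (1)
	{
		line = readline("minishell$ ");
		if (line == NULL)			/* Ctrl+D on an empty prompt: EOF */
		{
			printf("exit\n");
			break ;
		}
		if (*line)
			add_history(line);		/* keep the command history       */
		/* ... parse and execute `line` here ... */
		free(line);
	}
	return (0);
}
```

Note that `Ctrl+D` never raises a signal: readline simply returns `NULL` at end-of-file, which the loop treats as an `exit`.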
The parsing stage is the backbone of the shell, as it transforms the raw input into a structured format for execution. It consists of two main steps: tokenization and semantic analysis.
The tokenization process splits the user's input into tokens, which are the smallest meaningful units of the command. These tokens include:
- Commands (e.g., `ls`, `cat`, `grep`)
- Arguments (e.g., `-l`, `file.txt`)
- Redirections (`>`, `>>`, `<`, `<<`)
- Pipes (`|`)
- Identify the components: The input string is traversed character by character to identify commands, arguments, redirections, and special characters like `|`.
- Handle quotes: Properly handles single (`'`) and double (`"`) quotes to preserve spaces or special characters inside quoted strings.
  - Single quotes prevent all expansions.
  - Double quotes allow variable expansions (`$`).
- Ignore whitespace: Skips unnecessary spaces while separating meaningful components.
- Classify tokens: Assigns a type to each token (e.g., `TOKEN_COMMAND`, `TOKEN_ARGUMENT`, `TOKEN_PIPE`, etc.); a minimal token representation is sketched after the example below.
Input:

```bash
echo "hello world" | grep hello > output.txt
```

Tokens generated:

```
[TOKEN_COMMAND: echo]
[TOKEN_ARGUMENT: "hello world"]
[TOKEN_PIPE: |]
[TOKEN_COMMAND: grep]
[TOKEN_ARGUMENT: hello]
[TOKEN_REDIRECTION_OUT: >]
[TOKEN_ARGUMENT: output.txt]
```
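
As an illustration, such tokens could be stored in a simple linked list. The enum values below mirror the names used in the example, but the struct fields and the helper function are assumptions, not the project's exact API.

```c
/* Minimal sketch of a token list for the tokenization stage. */
#include <stdlib.h>
#include <string.h>

typedef enum e_token_type
{
	TOKEN_COMMAND,
	TOKEN_ARGUMENT,
	TOKEN_PIPE,
	TOKEN_REDIRECTION_IN,	/* <  */
	TOKEN_REDIRECTION_OUT,	/* >  */
	TOKEN_REDIRECTION_APP,	/* >> */
	TOKEN_HEREDOC			/* << */
}	t_token_type;

typedef struct s_token
{
	t_token_type	type;
	char			*value;		/* raw text, quotes already resolved */
	struct s_token	*next;
}	t_token;

/* Append a new token to the end of the list and return it. */
static t_token	*token_add(t_token **list, t_token_type type, const char *value)
{
	t_token	*node;
	t_token	*last;

	node = malloc(sizeof(*node));
	if (!node)
		return (NULL);
	node->type = type;
	node->value = strdup(value);
	node->next = NULL;
	if (!*list)
		*list = node;
	else
	{
		last = *list;
		while (last->next)
			last = last->next;
		last->next = node;
	}
	return (node);
}
```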
The semantic analysis phase transforms the list of tokens into a hierarchy of commands and redirections while validating the syntax. This ensures that the input is both logical and executable.
- Validate Syntax:
  - Detects invalid sequences like `| |`, `; ;`, or missing arguments for redirections (`>`, `<`).
  - Example: `echo |` will raise an error.
- Group Tokens:
  - Constructs a command tree where each node represents a command or redirection (a sketch of this structure follows the example below).
  - Associates redirections (`>`, `<`, etc.) with their respective commands.
- Handle Logical Constructs:
  - Maps pipes (`|`) to connect commands in a pipeline.
Input:

```bash
cat file.txt | grep "hello" > result.txt
```

Command structure:

```
Command 1:
- Executable: cat
- Arguments: [file.txt]
Pipe to Command 2:
Command 2:
- Executable: grep
- Arguments: ["hello"]
- Redirection: > result.txt
```
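
A possible C representation of this structure is sketched below; the type and field names are illustrative, the point being that each pipeline stage owns its own argument vector and its own list of redirections.

```c
/* Minimal sketch of the command structure produced by semantic analysis.
 * Redirections are kept per command so the executor can apply them
 * after forking. */
typedef enum e_redir_type
{
	REDIR_IN,		/* <  */
	REDIR_OUT,		/* >  */
	REDIR_APPEND,	/* >> */
	REDIR_HEREDOC	/* << */
}	t_redir_type;

typedef struct s_redir
{
	t_redir_type	type;
	char			*target;		/* file name or heredoc delimiter */
	struct s_redir	*next;
}	t_redir;

typedef struct s_command
{
	char				*executable;	/* e.g. "grep"                */
	char				**args;			/* NULL-terminated argv array */
	t_redir				*redirs;		/* redirections, in order     */
	struct s_command	*next;			/* next command in pipeline   */
}	t_command;
```

For the input above, the list would hold two `t_command` nodes, the second carrying a single `REDIR_OUT` entry targeting `result.txt`.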
Once the input is parsed and analyzed, the shell proceeds to the execution phase. Commands are executed in separate processes, with builtins handled directly in the shell process.
- Forking:
  - Creates a new process using `fork()` for each external command.
  - The parent process waits for the child process to complete using `waitpid()`.
- Piping:
  - Sets up pipes (`pipe()`) to connect the output of one command to the input of the next (see the sketch below).
  - Uses `dup2()` to duplicate file descriptors for standard input/output redirection.
- Redirections:
  - Opens file descriptors for input/output redirections (`<`, `>`, `>>`, `<<`).
  - Uses `dup2()` to redirect standard input/output to the appropriate file descriptors.
- Builtin Commands:
  - Executes builtins directly without forking to avoid unnecessary processes.
  - Builtins are handled by specific functions within the shell process.
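
The sketch below shows the core `fork()`/`pipe()`/`dup2()`/`execve()` pattern for a two-command pipeline. It assumes the caller has already resolved the executables' paths and built the argument and environment arrays, and it omits error handling and redirection setup for brevity; the real shell generalizes this to N commands.

```c
/* Minimal sketch: execute `cmd1 | cmd2` with fork(), pipe(), dup2()
 * and execve(). PATH resolution and error handling are omitted. */
#include <sys/wait.h>
#include <unistd.h>

static void	run_pipeline(char **argv1, char **argv2, char **envp)
{
	int		fd[2];
	pid_t	pid1;
	pid_t	pid2;
	int		status;

	pipe(fd);
	pid1 = fork();
	if (pid1 == 0)
	{
		dup2(fd[1], STDOUT_FILENO);	/* cmd1 writes into the pipe       */
		close(fd[0]);
		close(fd[1]);
		execve(argv1[0], argv1, envp);
		_exit(127);					/* reached only if execve() fails  */
	}
	pid2 = fork();
	if (pid2 == 0)
	{
		dup2(fd[0], STDIN_FILENO);	/* cmd2 reads from the pipe        */
		close(fd[0]);
		close(fd[1]);
		execve(argv2[0], argv2, envp);
		_exit(127);
	}
	close(fd[0]);					/* parent keeps no pipe ends       */
	close(fd[1]);
	waitpid(pid1, &status, 0);
	waitpid(pid2, &status, 0);		/* $? comes from the last command  */
}
```

Builtins such as `cd` or `export` are deliberately not run through this path: executing them in a child process would discard their effect on the shell's own environment.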
```bash
# Compile:
make

# Start shell:
./minishell

# Start shell with test (Norm and Valgrind):
make test
```
⊹ ࣪ ﹏𓊝﹏𓂁﹏⊹ ࣪ ˖
At 42 School, most projects must comply with the Norm.