Mastering the strace Tool: A Practical Guide
Introduction
Linux is a powerful and flexible operating system that developers around the world value for its openness, reliability, and ability to be customized. One of its standout tools is `strace`, an essential utility for debugging and gaining insights into how programs interact with the system.
In this blog post, we’ll dive deep into `strace`, uncovering how it works, its role in the Linux ecosystem, and how you can wield it to debug and analyze applications effectively. We will also focus on how to interpret its output, so you can make the most of this tool.
What is the Linux System?
Linux is a Unix-like operating system kernel that forms the backbone of countless distributions, from Ubuntu to Fedora. It manages hardware resources, provides an interface for applications to run, and facilitates multitasking and networking.
Why Developers Love Linux
- Transparency: Linux is open-source, allowing anyone to inspect, modify, and learn from its code.
- Stability: Renowned for its reliability in both desktop and server environments.
- Customization: From the kernel to the desktop environment, everything is customizable.
- Community: A robust community and wealth of resources make Linux approachable for beginners and experts alike.
System Calls in Linux: Categories and Main Methods
Before we dive into `strace`, let’s first understand system calls. These are the bridge between user applications and the kernel. When an application needs to perform a task that requires hardware access or privileged operations, it uses system calls.
In simple terms, a system call is a way for a program to ask the operating system to do something on its behalf. When a program makes a system call, the CPU switches from regular user mode to a more privileged mode called kernel mode. The kernel then performs the requested service, such as reading a file or creating a new process, and returns the result to the program.
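To make this concrete, here is a minimal Python sketch (Python is used only for illustration; it is not part of the original post). Python's `os` module offers thin wrappers around several system calls, so the call below hands the actual work to the kernel. The helper name `write_bytes` is hypothetical.

```python
import os


def write_bytes(fd: int, message: str) -> int:
    """Ask the kernel to write `message` to file descriptor `fd`.

    os.write() wraps the write system call and returns the number of
    bytes the kernel accepted.
    """
    return os.write(fd, message.encode())


if __name__ == "__main__":
    # File descriptor 1 is standard output.
    count = write_bytes(1, f"hello from PID {os.getpid()}\n")
    print(f"the kernel accepted {count} bytes")
```

Running a script like this under `strace` should show a `write(1, ...)` line for the call made by `write_bytes`.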
System calls are grouped into five main categories:
- Process Control
- File Management
- Device Management
- Information Maintenance
- Communication
Let’s explore each category in detail and list the main system calls related to them.
1. Process Control
Process control system calls manage the execution of processes. These system calls enable the creation, termination, and manipulation of processes and threads. The operating system provides these services to manage the execution lifecycle of a process.
Main System Calls in Process Control:
- `fork()`: Creates a new process by duplicating the calling process. The new process is a child of the calling process.
  - Return Value: The process ID of the child is returned to the parent, and `0` is returned to the child process.
- `exec()`: Replaces the current process’s image with a new program. It is typically used after `fork()` to run a different program in the child process.
  - Variants: `execv()`, `execvp()`, `execl()`, etc. These are library wrappers around the `execve()` system call, which is the name that shows up in `strace` output.
- `wait()`: Makes the parent process wait for the termination of a child process. It returns the exit status of the terminated child.
- `exit()`: Terminates the calling process and returns an exit status to the parent process.
- `getpid()`: Returns the process ID (PID) of the calling process.
- `getppid()`: Returns the parent process ID (PPID) of the calling process.
- `kill()`: Sends a signal to a process, which can terminate it or trigger other actions based on the signal.
Explanation:
- Process Creation: The `fork()` system call is fundamental to process creation. It creates a new child process which is a copy of the parent process, and then you can use `exec()` to replace the child’s image with a different program.
- Process Termination: After the child process completes its task, it exits, and the parent can capture the exit status using `wait()`.
- Signal Handling: The `kill()` system call sends a signal to a process, allowing communication or termination of processes.
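The fork, exec, and wait sequence described above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the original post: `run_child` is a hypothetical helper, and `os.fork()`, `os.execvp()`, and `os.waitpid()` are Python's wrappers around the corresponding calls.

```python
import os
import sys


def run_child(argv):
    """Fork, exec `argv` in the child, and return the child's exit status."""
    pid = os.fork()  # returns 0 in the child, the child's PID in the parent
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # replace the child's image with a new program
        finally:
            os._exit(127)  # reached only if the exec failed
    _, status = os.waitpid(pid, 0)  # parent blocks until the child terminates
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # the child was killed by a signal


if __name__ == "__main__":
    code = run_child([sys.executable, "-c", "raise SystemExit(3)"])
    print(f"child exited with status {code}")
```

The child calls `os._exit()` rather than returning, so a failed exec can never leave two copies of the parent program running.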
2. File Management
File management system calls allow processes to interact with files. These system calls help manage file access, reading, writing, and other file-related operations.
Main System Calls in File Management:
- `open()`: Opens a file for reading, writing, or both. It returns a file descriptor used for further operations on the file.
  - Flags: `O_RDONLY`, `O_WRONLY`, `O_RDWR`, `O_CREAT`, `O_TRUNC`, etc.
- `read()`: Reads data from a file. It takes a file descriptor and stores the data into a buffer.
- `write()`: Writes data to a file. It takes a file descriptor and the data to be written.
- `close()`: Closes an open file descriptor, releasing the resources associated with it.
- `lseek()`: Changes the file offset for the next read/write operation. This is used to jump to a specific position in a file.
- `stat()`: Retrieves information about a file, such as its size, permissions, and timestamps.
- `unlink()`: Deletes a file by removing its name from the filesystem. The data is released once no other links or open descriptors refer to it.
Explanation:
- File Operations: `open()`, `read()`, `write()`, and `close()` are the core system calls for interacting with files.
- File Information: `stat()` provides metadata about a file, while `lseek()` allows for precise file navigation.
- File Deletion: `unlink()` removes a file from the filesystem, freeing up space.
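As a rough illustration of these calls working together, the sketch below runs a full open, write, seek, read, stat, close, and unlink cycle on a temporary file. It is an assumption-laden example rather than anything from the original post: the helper name `file_roundtrip` and the file name `example.txt` are made up for the demonstration.

```python
import os
import tempfile


def file_roundtrip():
    """Exercise the core file system calls on a throwaway file."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "example.txt")
        # open(): create the file and get a file descriptor back
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(fd, b"hello strace")  # write(): 12 bytes
            os.lseek(fd, 6, os.SEEK_SET)   # lseek(): move the offset to byte 6
            tail = os.read(fd, 100)        # read(): from the offset to end of file
            size = os.stat(path).st_size   # stat(): file metadata
        finally:
            os.close(fd)                   # close(): release the descriptor
        os.unlink(path)                    # unlink(): remove the file name
        return tail, size, os.path.exists(path)


if __name__ == "__main__":
    print(file_roundtrip())
```

Tracing this script with `strace` is a handy way to see each of the calls listed above in order.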
3. Device Management
Device management system calls handle interactions with hardware devices, including input/output devices such as disks, terminals, and network interfaces.
Main System Calls in Device Management:
- `ioctl()`: Provides a way for applications to communicate with device drivers, allowing them to control hardware directly. This call can perform various device-specific operations.
- `read()` and `write()`: These system calls are also used for device I/O, where a device is treated as a file.
- `mmap()`: Maps a file or device into memory. This allows for direct memory access to hardware, offering faster I/O operations.
Explanation:
- Device Interaction: Devices in Linux are represented as files, and `read()` and `write()` are commonly used to interact with them.
- Device Control: `ioctl()` allows for device-specific operations that go beyond basic file operations.
- Memory Mapping: `mmap()` is often used for devices like graphics cards and for memory-mapped I/O operations.
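Because devices appear as files, ordinary file calls work on them. The sketch below reads from two standard Linux device files and maps a file into memory. It is only an illustration under a few assumptions: the helper names `read_device` and `mapped_copy` are hypothetical, and the `mmap` part maps a regular temporary file instead of real hardware, since that works on any machine.

```python
import mmap
import os
import tempfile


def read_device(path: str, count: int) -> bytes:
    """Read up to `count` bytes from a device file using plain open/read/close."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, count)
    finally:
        os.close(fd)


def mapped_copy(payload: bytes) -> bytes:
    """Write a non-empty payload to a temporary file, map it, and read it back."""
    with tempfile.TemporaryFile() as handle:
        handle.write(payload)
        handle.flush()
        # Length 0 asks mmap to map the whole file.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as region:
            return region[:]


if __name__ == "__main__":
    print(read_device("/dev/null", 16))  # /dev/null reports end of file at once
    print(read_device("/dev/zero", 4))   # /dev/zero supplies zero bytes
    print(mapped_copy(b"mapped"))
```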
4. Information Maintenance
Information maintenance system calls help manage and retrieve system information, such as process information, system time, and system configuration.
Main System Calls in Information Maintenance:
- `gettimeofday()`: Retrieves the current time of day, including the number of seconds and microseconds since the Unix epoch.
- `time()`: Returns the number of seconds since the Unix epoch.
- `uname()`: Provides system information, such as the kernel version, machine architecture, and operating system.
- `sysinfo()`: Retrieves various system statistics, including memory usage, load average, and uptime.
- `getpid()` and `getppid()`: These calls return the process ID and parent process ID, respectively.
Explanation:
- Time: `gettimeofday()` and `time()` are used to fetch the current system time, which is crucial for timestamps and scheduling.
- System Info: `uname()` provides details about the system, while `sysinfo()` offers performance statistics.
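A small Python sketch can gather the same kind of information. The helper name `system_snapshot` is hypothetical. Note also one assumption: which system call sits behind `time.time()` is an implementation detail, and on Linux time queries may be answered without entering the kernel at all, so they may not appear in a trace.

```python
import os
import time


def system_snapshot() -> dict:
    """Collect basic system and process information."""
    info = os.uname()  # wraps the uname system call
    return {
        "sysname": info.sysname,            # operating system name
        "release": info.release,            # kernel release
        "machine": info.machine,            # hardware architecture
        "epoch_seconds": int(time.time()),  # seconds since the Unix epoch
        "pid": os.getpid(),                 # this process
        "ppid": os.getppid(),               # its parent
    }


if __name__ == "__main__":
    for key, value in system_snapshot().items():
        print(f"{key}: {value}")
```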
5. Communication
Communication system calls facilitate the exchange of data between processes, either within the same system or across networked systems.
Main System Calls in Communication:
- `pipe()`: Creates a unidirectional data channel (pipe) between two processes. One process writes to the pipe, and the other reads from it.
- `socket()`: Creates an endpoint for network communication. This can be used to create TCP/UDP sockets for inter-process communication over the network.
- `connect()`: Establishes a connection to a remote socket (e.g., a network server).
- `send()` and `recv()`: Used for sending and receiving data over a socket connection.
- `shmget()`: Creates a shared memory segment, allowing multiple processes to access the same memory space.
- `msgget()`: Creates a message queue, enabling inter-process communication via messages.
Explanation:
- Inter-process Communication: `pipe()`, `shmget()`, and `msgget()` provide ways for processes to share data. Pipes offer communication between parent-child processes, while shared memory allows multiple processes to access the same memory space.
- Network Communication: `socket()`, `connect()`, `send()`, and `recv()` are the core calls for network communication, forming the foundation for client-server interactions.
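Here is a minimal sketch of both styles of communication, kept inside a single process so it runs anywhere. The helper names are hypothetical, and the socket example uses `socket.socketpair()` (a pre-connected pair of local sockets) as a stand-in for the full `socket()` plus `connect()` sequence against a real server.

```python
import os
import socket


def pipe_roundtrip(message: bytes) -> bytes:
    """Send bytes through a pipe and read them back."""
    read_end, write_end = os.pipe()  # pipe(): a one-way channel with two ends
    os.write(write_end, message)
    os.close(write_end)              # closing the write end lets readers see end of file
    data = os.read(read_end, 4096)
    os.close(read_end)
    return data


def socket_roundtrip(message: bytes) -> bytes:
    """Send bytes over a connected pair of local sockets."""
    left, right = socket.socketpair()
    with left, right:
        left.sendall(message)    # send side
        return right.recv(4096)  # receive side


if __name__ == "__main__":
    print(pipe_roundtrip(b"ping"))
    print(socket_roundtrip(b"pong"))
```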
What is `strace`?
`strace` is a diagnostic, debugging, and troubleshooting tool for Linux that tracks the system calls made by a process. It produces a real-time or recorded log of those system calls and of the signals the process receives.
Think of `strace` as a microscope for your program's execution: it reveals the sequence of system calls and signals, giving you insight into what your program is doing behind the scenes.
How `strace` Works
`strace` operates by intercepting and recording the system calls made by a process. It relies on the kernel's `ptrace` facility, which stops the traced process at each system call entry and exit so that `strace` can print the call's name, its arguments, and its return value.
Installing `strace`
Before using `strace`, ensure it’s installed on your system:
sudo apt install strace # For Debian/Ubuntu systems
How to Read `strace` Output
Understanding `strace` output is critical to using it effectively. The typical format of a system call in the output is as follows:
open("example.txt", O_RDWR) = 3
Breaking Down the Structure
- System Call Name:
  - `open`: This is the name of the system call being invoked.
- Arguments:
  - `"example.txt"`: The first argument, typically a string, specifies the file to open.
  - `O_RDWR`: The second argument, often a constant, specifies the mode (read/write in this case).
- Return Value:
  - `= 3`: The return value is `3`, which is the file descriptor assigned by the kernel for this file. If the call fails, this would instead be `-1` with an error code (e.g., `ENOENT`) displayed.
Understanding this structure allows you to trace the behavior of your application step by step.
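The name, arguments, and return value layout is regular enough to pick apart with a short script. The following is a rough sketch, not a complete parser: `TraceLine` and `parse_line` are hypothetical names, and the regular expression only covers simple single-line entries like the ones shown in this post.

```python
import re
from typing import NamedTuple, Optional


class TraceLine(NamedTuple):
    name: str
    args: str
    result: Optional[int]  # None when strace prints "?" instead of a number
    error: Optional[str]   # error code such as "ENOENT", or None on success


_PATTERN = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+"
    r"(?P<result>-?(?:0x[0-9a-fA-F]+|\d+)|\?)"
    r"(?:\s+(?P<error>E[A-Z0-9]+)\b)?"
)


def parse_line(line: str) -> Optional[TraceLine]:
    """Split one strace line into name, arguments, return value, and error code."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None  # not a complete "name(args) = result" line
    raw = match["result"]
    result = None if raw == "?" else int(raw, 0)  # base 0 also accepts hex values
    return TraceLine(match["name"], match["args"], result, match["error"])


if __name__ == "__main__":
    print(parse_line('open("example.txt", O_RDWR) = 3'))
    print(parse_line('open("missing.txt", O_RDWR) = -1 ENOENT (No such file or directory)'))
```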
Example: Tracing `cat /dev/null`
Run `cat /dev/null` with `strace`:
strace cat /dev/null
Sample Output:
execve("/usr/bin/cat", ["cat", "/dev/null"], 0x7ffd6925a260 /* 64 vars */) = 0
.......
openat(AT_FDCWD, "/dev/null", O_RDONLY) = 3
newfstatat(3, "", {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x3), ...}, AT_EMPTY_PATH) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "", 131072) = 0
....
close(3) = 0
....
exit_group(0)
Detailed Interpretation
execve("/usr/bin/cat", ["cat", "/dev/null"], 0x7ffd6925a260 /* 64 vars */) = 0
- `execve`: This system call executes the `/usr/bin/cat` program, passing the arguments `cat` and `/dev/null`.
- `["cat", "/dev/null"]`: These are the command-line arguments.
- `= 0`: The execution of `execve` was successful, returning 0.
openat(AT_FDCWD, "/dev/null", O_RDONLY) = 3
- `openat`: Opens the file `/dev/null` with read-only (`O_RDONLY`) access.
- `/dev/null`: The special file `/dev/null` is a device that discards all data written to it and returns EOF when read.
- `= 3`: The kernel successfully opens `/dev/null` and assigns file descriptor 3.
read(3, "", 131072) = 0
- `read`: Reads from file descriptor 3 (`/dev/null`). Since `/dev/null` always returns EOF, no data is read.
- `= 0`: Indicates that nothing was read because `/dev/null` produces no content.
close(3) = 0
- `close`: Closes file descriptor 3 (the open `/dev/null`).
- `= 0`: The file descriptor was successfully closed.
exit_group(0)
- `exit_group`: The process exits with an exit status of 0, indicating that it completed successfully.
Key Flags in `strace`
`strace` offers various options (flags) that let you customize its behavior. Below are some of the most useful ones:
1. `-e` (Expression)
Use `-e` to specify which system calls to trace. For example, to trace only `openat` calls:
strace -e openat ls -l
Output:
openat(AT_FDCWD, "ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "../../libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "../../libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "../../libpcre2-8.so.0", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "../../filesystems", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "../../locale-archive", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
Example:
- Trace only the `openat` and `read` system calls:
strace -e trace=openat,read ls -l
- Trace all system calls except `openat`:
strace -e \!openat ls -l
The `!` has to be protected because shells such as bash treat it as a special character, so an unquoted `!openat` can make the command fail before `strace` even starts.
Use single quotes: strace -e '!openat' ls -l
or escape it with a backslash: strace -e \!openat ls -l
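If you build `strace` command lines from a script, you can let a library handle the quoting. The sketch below uses Python's standard `shlex.join()` (available since Python 3.8) to produce a shell-safe command string. The helper name `strace_command` is hypothetical, and the sketch only builds the string; it does not run `strace`.

```python
import shlex


def strace_command(expression, program):
    """Return a shell-safe strace command line for the given -e expression."""
    argv = ["strace", "-e", expression, *program]
    return shlex.join(argv)  # quotes only the arguments that need it


if __name__ == "__main__":
    print(strace_command("!openat", ["ls", "-l"]))
    print(strace_command("trace=openat,read", ["ls", "-l"]))
```

The negated expression comes out wrapped in single quotes, while the plain `trace=openat,read` expression is left untouched because it contains nothing the shell treats specially.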
2. `-f` (Follow Forks)
The `-f` flag tells `strace` to follow child processes created by the traced process. This is useful when debugging programs that create child processes using `fork()` or `clone()`.
Example:
strace -f bash -c "echo hello; ls /tmp"
Here bash typically runs external commands such as `ls` in a separate process, while `echo` is a shell builtin that runs inside bash itself. With `-f`, `strace` traces the system calls made by bash and by every child process it creates.
Another example:
strace -f git clone https://github.com/Edmondi-Kacaj/shell-examples.git
- Why? `git` spawns subprocesses for network operations, unpacking objects, and interacting with the filesystem.
3. `-p` (Process ID)
The `-p` flag allows you to attach `strace` to an already running process by specifying its process ID (PID).
Example:
strace -p 12345
This will attach `strace` to the process with PID `12345` and start tracing its system calls.
4. `-s` (String Size)
By default, `strace` prints only the first 32 bytes of any string argument passed to a system call. You can change this limit with the `-s` flag to show longer strings.
Example:
- Without `-s`:
strace -e trace=write ls
- Output:
write(1, "image1.jpg miscfile notes.txt "..., 62image1.jpg miscfile notes.txt report.pdf script1.sh test) = 62
- With `-s`:
strace -e trace=write -s 256 ls
- Output:
write(1, "image1.jpg miscfile notes.txt report.pdf script1.sh test\n", 62image1.jpg miscfile notes.txt report.pdf script1.sh test) = 62
The second command traces the `write` system calls and displays up to 256 bytes of each string argument, so the whole buffer is visible. In both outputs, the file names printed by `ls` itself appear in the middle of the trace line because `ls` and `strace` are writing to the same terminal.
5. `-o` (Output File)
The `-o` flag writes the output of `strace` to a file instead of printing it to the terminal.
Example:
strace -o output.txt ls -l
This will save the trace output to `output.txt`.
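Once the trace is in a file, it is easy to post-process. As a small sketch (the helper name `count_syscalls` is hypothetical, and the pattern assumes each entry starts with the system call name, as in the examples in this post), the script below counts how often each system call appears:

```python
import re
from collections import Counter

# A trace entry starts with the system call name followed by "(".
_CALL = re.compile(r"^(\w+)\(")


def count_syscalls(lines):
    """Count system calls by name in an iterable of strace output lines."""
    counts = Counter()
    for line in lines:
        match = _CALL.match(line)
        if match:  # skips elided or informational lines
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        'openat(AT_FDCWD, "/dev/null", O_RDONLY) = 3',
        'read(3, "", 131072) = 0',
        "close(3) = 0",
    ]
    for name, total in count_syscalls(sample).most_common():
        print(f"{total:5d} {name}")
```

For a real trace, pass an open file instead of the sample list, for example `count_syscalls(open("output.txt"))`.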
6. `-y` or `-yy` (Level of Detail for File Descriptors)
Let's say we want to spy on the `ls -l` command (tracing only `read`):
strace -e trace=read ls -l
Result:
read(3, "...."..., 4096) = 2996
But we don't know which file descriptor `3` refers to.
The `-y` flag prints the path associated with each file descriptor argument, so instead of a bare number you see the actual file name. `-yy` adds even more detail, such as protocol-specific information for socket descriptors.
Example:
strace -y -e trace=read ls -l
Result:
read(3</etc/locale.alias>, "...... "..., 4096) = 2996
As we can see, descriptor `3` now refers to `/etc/locale.alias`.
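You can look up the same descriptor-to-path mapping yourself on Linux through the `/proc` filesystem. The sketch below is only an illustration: `describe_fd` is a hypothetical helper, it assumes `/proc` is mounted, and it is not meant to describe how `strace` obtains the path internally.

```python
import os


def describe_fd(fd: int) -> str:
    """Format a descriptor of the current process as `fd<path>`, in the style of -y."""
    target = os.readlink(f"/proc/self/fd/{fd}")  # symlink to whatever the descriptor refers to
    return f"{fd}<{target}>"


if __name__ == "__main__":
    fd = os.open("/dev/null", os.O_RDONLY)
    try:
        print(describe_fd(fd))
    finally:
        os.close(fd)
```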
Example Walkthroughs
Example 1: Monitoring Network Traffic
If you want to trace only network-related system calls (like `socket()`, `connect()`, `send()`, `recv()`) for a process, you can use the following:
strace -e trace=network -p 12345
Output example:
recvmsg(.., {msg_namelen=0}, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvmsg(.., {msg_name=.., msg_namelen=.., msg_iov=[{iov_base="...", iov_len=...}], msg_iovlen=.., msg_controllen=.., msg_flags=0}, 0) = 32
This will show network-related activity of the process with PID `12345`, such as establishing connections and sending/receiving data.
Example 2: Trying to open a file without permission
This example uses `strace` to trace the system calls involved when a program attempts to open or create a file without sufficient permissions.
strace -e trace=openat,write -s 128 /bin/touch /etc/apache2/sites-enabled/000-default.conf
Explanation:
- `-e trace=openat,write`: Filters the trace to show only the `openat` (file access) and `write` (here, the error message) system calls.
- `-s 128`: Displays up to `128` bytes of string data, so that longer strings such as file paths are fully visible.
- `/bin/touch /etc/apache2/sites-enabled/000-default.conf`: The program being traced attempts to create or update a file in a protected directory; in this case, the default site file generated by the Apache server.
Output
openat(AT_FDCWD, "/etc/apache2/sites-enabled/000-default.conf", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 EACCES (Permission denied)
.....
write(2, ": Permission denied", 19: Permission denied) = 19
......
Detailed Breakdown:
- `openat`:
  - Attempts to open or create the file `/etc/apache2/sites-enabled/000-default.conf`.
  - Fails with `-1 EACCES`, indicating "Permission denied".
- `write`:
  - Outputs the error message `: Permission denied` to standard error (file descriptor `2`).
  - Writes 19 bytes successfully, showing how the error is communicated to the user.
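The error codes that `strace` prints are the same ones a program sees when a call fails. Here is a minimal sketch of that link, using Python's `errno` module. The helper name `explain_failure` and the path in the demo are hypothetical, and the demo triggers `ENOENT` instead of `EACCES` because a missing path fails the same way for every user.

```python
import errno
import os


def explain_failure(path, flags=os.O_RDONLY):
    """Try to open `path`; return (error code, message) if the kernel refuses."""
    try:
        fd = os.open(path, flags)
    except OSError as exc:
        # exc.errno is the numeric code that strace prints symbolically.
        return errno.errorcode[exc.errno], os.strerror(exc.errno)
    os.close(fd)
    return None  # the open succeeded


if __name__ == "__main__":
    print(explain_failure("/nonexistent-strace-demo/file.txt"))
    print(errno.errorcode[errno.EACCES], os.strerror(errno.EACCES))
```

The text for `EACCES` is "Permission denied", and the string `: Permission denied` is 19 characters long, which matches the `= 19` returned by the `write` call in the trace.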
Conclusion
In this blog, we learned about system calls and their role in enabling communication between user applications and the kernel. We explored the different categories of system calls and how they help manage processes, files, devices, information, and communication.
We also went deep into understanding the `strace` command, an invaluable tool for tracing and debugging the system calls made by a program. With `strace`, we can monitor system calls in real time, filter specific calls, follow child processes, and save trace output to files. Mastering `strace` gives you the ability to diagnose issues, optimize performance, and gain a deeper understanding of how your programs interact with the Linux operating system.